In the past few years, statistics in the United States have come to
be used as determinants of private and public actions affecting the
fortunes of millions of our fellow citizens. As such, they have captured 
the interest and have become the concern of many who were previously 
unaware of the existence of statistics. 

This development has come so suddenly, and spread into so many
fields of activity, that the statistical profession as a whole seems
almost unaware of what has been happening. As a profession, we are
scarcely prepared, and certainly not organized, to meet the serious 
responsibilities placed upon us by these new uses of statistics. 

It is to these new responsibilities, and the challenge which they 
present, that I wish to direct your attention this evening. I propose, 
first, to review the new ways in which statistics are now used; then to 
discuss the high standards of statistical competence and integrity 
which these new uses require us to maintain; the new phases of
statistical training which they suggest; and the public responsibilities
which they impose upon statistical agencies and statisticians. 

I need scarcely remind this audience that statistics have come to be 
one of the great descriptive and analytical tools of modern industrial 
society, comparable to the other new tools of science. The varied pro- 
gram of this Association at this Annual Meeting is evidence of the 
fact that statistics are applied today not merely to describe the
behavior of man as a social and economic animal, but also to describe
plant and animal life, the weather, and, indeed, the shape of the uni- 
verse. 

Statistics of a sort can, of course, be traced back to ancient times, 
but they have flowered since the industrial revolution. Beginning in
the 19th century, statistical records were developed to describe the
society of that era, and to throw light on its economic and social
problems. No doubt they influenced the course of men's thinking then,
and even, in some instances, may have led to new policies and new
laws; but primarily their uses were descriptive.

* Presidential Address at the 112th Annual Meeting of the American Statistical Association,
Chicago, December 28, 1952.

Increasingly, in the 20th century, and especially since World War I, 
statistics have been used to settle problems, and to determine courses 
of action. In private enterprise, quality control tests now change the 
production lines of industrial enterprises. New products are developed 
and tested by statistical means. Scientific experiments turn upon sta- 
tistics. The management of industry today employs statistics in many 
ways as a guide to internal operations.

Their use in the sphere of public economic policy-making has ex- 
panded almost unbelievably. Beginning with the 1920's, we have seen 
the development of a whole battery of economic statistics, collected in 
great detail, by industry, area, and product, and summarized in the 
familiar indexes and aggregates designed to measure changes in the 
economic cycle, e.g., the gross national product; the indexes of
industrial production; employment and unemployment; financial statistics;
price data of many kinds; wages; the volume of trade and commerce— 
and so on through a long catalogue. As time has gone on, these reports 
have been compiled in more and more detail, with greater frequency 
and improved quality. 

These statistics come from many sources: from individual business
firms and industries and from the professions and their associations;
from labor organizations; from schools and colleges; and from private
research organizations of many kinds. Because they are often nation-wide
and necessarily costly, they have been compiled to an increasing
extent in the past two decades by the Federal Government. Some of 
these statistical producing agencies are old, tried, and experienced, and 
their standards of performance are well known; others are virtually 
unknown and untested. No license is required to produce statistics. 
These agencies produce statistics for widely differing purposes and, con- 
sequently, some of these summary statistics are finer, more precise 
instruments than others. But for the most part, these summary eco- 
nomic indicators were intended primarily as analytical tools, for quite 
broad economic analysis. They have long been used by the Congress and by
agencies of the government in making laws and determining
policies, as well as by private firms and individuals in making their
decisions. As a rule, they have been only one of many factors
considered in reaching a decision, and since they were designed as
approximations, they are suitable for just such general purposes as these,
where a fairly wide tolerance is acceptable.

World War II, however, made sudden new demands upon statistics 
for purposes of regulation and control. (There had been some other 
instances of this kind previously, but they had not been so important.)
All at once, detailed economic statistics were in demand for determination
of allocations of materials or of shipping space, or for price or wage
regulation, or for other control purposes. Those statistics at hand were 
seized upon, whether or not they were really suitable for control pur- 
poses. If no statistics were available, a special-purpose survey was 
usually hurriedly devised, often with sampling errors of undetermined
magnitude. Here, a mistake in classification, in tabulation, or, most
important, in the framing of the questions or in the interpretation of
the data, had immediate repercussions upon business firms, their
employees, and the defense program itself. The statistical profession, it is
clear, was quite unprepared for the war emergency. Yet those special 
wartime statistics were discontinued, with considerable enthusiasm, 
as soon as controls ended, only to be partially and hastily reassembled 
for the Defense Program in 1950. Clearly, economists and statisticians 
have still not provided the public with a design for an adequate,
integrated economic intelligence system, based upon sound statistical 
standards for use in such emergencies. We have not even provided a 
first blueprint of such a plan. 

In still another field, statistics have been used more and more as 
evidence admissible in court. Both detailed studies relating to particular
firms or industries and broad economic indicators appear in
attorneys' briefs and exhibits. Statistics are also used by the Federal
Trade Commission and the Department of Justice in deciding whether
to prosecute a case. The so-called concentration ratios employed in
anti-trust prosecutions are an example. 

Still more recently, statistics have come to be used as automatic
regulators or governors, in the engineer’s use of that term. They have
been written into law as “trigger figures,” determining automatically 
whether certain actions take place. They have been written into con- 
tracts between private firms and individuals, and between governments 
and private individuals. They have thus come to determine what prices 
millions of farmers get for their crops, or other millions of workers get 
for their day’s work, or the amount of taxes a business enterprise must 
pay, or what actual dollar price a producing company receives for a
new generator.

Consider for a moment some of these statistical governors. We are
familiar with them, but not necessarily with their new uses. The oldest 
in the United States is the Census, under the Constitutional provision 
whereby the number of representatives for each State in the House of
Representatives shall be determined by the Congress on the basis of a
Census of the Population taken once every 10 years. Thus, since the
Census of 1950, more representatives come from California and fewer 
from Pennsylvania. These Census calculations are generally accepted, 
since this is their primary purpose. 

Then, in point of time, there is the parity index, another specially 
constructed statistical governor. It was devised by Congress after long 
and bitter agricultural depression, as a yardstick against which to 
measure farm prices or the farmer’s fair share of the national product 
in comparison with the industry from which he buys and to which he 
sells. This yardstick—so complex that few economists and too few 
farmers understand it—is a ratio of indexes of prices paid by farmers 
to prices received by farmers, hooked back to a pre-World War I base 
of 1910-14. Today, price supports covering some 40 percent of the sales 
value of farm commodities are based upon the parity index and parity 
prices. Congress, having fixed the base period of 1910-14 by law, later 
amended the Act to add farm wages to the index of prices farmers pay, 
and to more or less modernize the price relationships as between the 
individual farm commodities. But for the most part, the parity index 
continues, with occasional quiet revisions by its custodians in the Bu- 
reau of Agricultural Economics, as one of the main factors governing 
the receipts of millions of farmers. 

Then there are several statistical measures of which the Bureau of 
Labor Statistics is custodian, which have similar uses as “triggers” or 
governors, only one of which was especially compiled for such purposes.

To take the most recent example first: The Bureau's statistics on
housing starts have just been used, under an Act passed by the Congress
in the summer of 1952, as a determinant of the time at which the
Federal Reserve limitations on housing credit under the so-called
“Regulation X” were suspended. These statistics were originally
designed as a descriptive measure of the approximate number of new
dwellings started in non-farm areas. They were then not calculated on 
the basis of a seasonally adjusted annual rate, as called for in the Act 
passed by the Congress, and so a difficult and necessarily rough sea- 
sonal calculation had to be made. Moreover, the preliminary form in 
which they are compiled has sizable potential errors for three months, 
until final returns come in. Yet the Act provided that they were to be 
used as “trigger figures” to suspend these particular credit regulations.
It is fortunate that the level of home-building activity in the summer
and early autumn of 1952, when this Act became effective, fell so far
short of the seasonally adjusted annual rate of housing starts specified
in the Act that the decision to suspend Regulation X could be made
with statistical confidence. It could easily have been otherwise.
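
The logic of such a “trigger figure” can be sketched in a few lines of Python. The sketch is illustrative only: the threshold, the seasonal factor, the monthly counts, and the 5 percent error allowance are hypothetical values chosen for the example, not figures taken from the Act or from the Bureau's series, and the function names are inventions of this sketch. It converts one month's starts to a seasonally adjusted annual rate and then asks whether the whole band of possible error lies on one side of the threshold.

```python
def seasonally_adjusted_annual_rate(monthly_starts, seasonal_factor):
    """Convert one month's count to a seasonally adjusted annual rate.

    seasonal_factor is the month's typical level relative to an average
    month (1.10 means the month usually runs 10 percent above average).
    """
    return monthly_starts / seasonal_factor * 12


def trigger_decision(rate, threshold, relative_error):
    """Say where the rate stands against the threshold, allowing for error.

    Returns "below" or "above" only when the entire error band lies on
    one side of the threshold; otherwise returns "uncertain".
    """
    low = rate * (1 - relative_error)
    high = rate * (1 + relative_error)
    if high < threshold:
        return "below"
    if low > threshold:
        return "above"
    return "uncertain"


if __name__ == "__main__":
    THRESHOLD = 1_200_000      # hypothetical statutory annual rate
    SEASONAL_FACTOR = 1.10     # hypothetical factor for a summer month
    PRELIMINARY_ERROR = 0.05   # hypothetical error in preliminary figures

    for monthly_starts in (95_000, 108_000):
        rate = seasonally_adjusted_annual_rate(monthly_starts, SEASONAL_FACTOR)
        verdict = trigger_decision(rate, THRESHOLD, PRELIMINARY_ERROR)
        print(f"{monthly_starts:>7,} starts -> annual rate {rate:,.0f}: {verdict}")
```

With these assumed numbers the first month falls clearly below the threshold, while the second lands inside the error band, which is the situation in which a preliminary figure could not safely decide the matter.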

A similar “trigger” use of the Consumers’ Price Index was considered 
when the Defense Production Act of 1950 was being debated in the 
Congress. In fact, this particular provision once passed the House of 
Representatives. It provided that price and wage controls during the 
defense emergency were to go into effect when the Consumers' Price 
Index had advanced by 5 percent from a given date. However, it was 
pointed out to members of the Congress that the Consumers’ Price 
Index, being made up of retail prices, would move late and slowly, 
and that prices of raw materials and industrial products might have 
sky-rocketed before the Consumers’ Price Index had moved by the sug- 
gested amount. By then it would have been too late for effective 
price control measures. Consequently, this particular proposal was 
dropped. 

The venerable Wholesale Price Index of the Bureau of Labor
Statistics has also been extensively used as an escalator in business
contracts during and since World War II. When this Index was revised
early in 1952, it was found that thousands of private contracts
involving billions of dollars contained escalator clauses by which the amount
to be paid by the purchaser varied automatically with the wholesale
index or some component of it. These contracts include government
contracts for such equipment as ships, which require several years to
construct; many private contracts for heavy equipment or
maintenance, such as generators or elevator maintenance; long-term leases on
commercial and industrial properties; public utility rates in one State; 
and a variety of other contracts. 

In certain of these contracts in which the materials component varies 
with wholesale prices of materials, the wage component escalates on
average hourly earnings, as compiled by the Bureau of Labor Statistics 
in connection with its nation-wide monthly survey of manufacturing 
employment, hours and earnings. This compilation of earnings, while 
it is very comprehensive, is not intended to be accurate to the last 
penny. 
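
A two-component escalator clause of this kind can be sketched as follows. This is a minimal illustration, not the wording of any actual contract: the shares assigned to materials and labor, the base price, and the index and earnings levels are all hypothetical, and the function name is an invention of this sketch. The materials share moves with the ratio of the current to the base wholesale index, the labor share moves with the ratio of current to base average hourly earnings, and the remainder of the price is held fixed.

```python
def escalated_price(base_price, materials_share, labor_share,
                    index_base, index_now, earnings_base, earnings_now):
    """Adjust a contract price under a two-component escalator clause.

    materials_share and labor_share are fractions of the base price;
    whatever is left over (1 - materials_share - labor_share) does not
    escalate.
    """
    fixed_share = 1.0 - materials_share - labor_share
    if fixed_share < 0:
        raise ValueError("materials and labor shares exceed the whole price")
    factor = (materials_share * index_now / index_base
              + labor_share * earnings_now / earnings_base
              + fixed_share)
    return base_price * factor


if __name__ == "__main__":
    # Hypothetical contract: 40 percent materials, 30 percent labor.
    price = escalated_price(
        base_price=1_000_000,
        materials_share=0.40, labor_share=0.30,
        index_base=100.0, index_now=110.0,
        earnings_base=1.50, earnings_now=1.65,
    )
    print(f"escalated price: ${price:,.2f}")

    # The same contract if average hourly earnings were reported one cent higher.
    shifted = escalated_price(1_000_000, 0.40, 0.30, 100.0, 110.0, 1.50, 1.66)
    print(f"effect of one cent in earnings: ${shifted - price:,.2f}")
```

The second call shows why the precision of the earnings series matters to the parties: under these assumed shares, a difference of a single cent in reported average hourly earnings moves the contract price by a visible amount.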

Another statistical series in the Bureau of Labor Statistics used for 
escalation purposes is the LIFO (Last In First Out) price index,
especially compiled at private expense for the use of department stores in
calculating allowable changes in inventory values for tax purposes,
under an agreement with the Bureau of Internal Revenue.

AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1953

Then there is the well-known use of the Consumers’ Price Index in 
wage contracts, under which wages change immediately and automatically
at regular intervals, usually quarterly, whenever there is a specified
change in the index. There are also many instances of agreements
in which wages or other income payments are subject to renegotiation 
or reconsideration upon a specified change in the index. In addition to 
wage contracts, the Consumers’ Price Index is used in many other ways 
as an escalator—for example, in leases for residential buildings, for pay- 
ments for family maintenance, such as relief payments and alimony 
payments. These are private contracts negotiated by industry and
labor and by private individuals, usually without consultation with 
the agency which issues the figures. Some of them are fairly long-term, 
like leases for residential buildings, or in the wage field, like the General 
Motors five-year contract. Others are for only a year or two. 

In addition to these governmental statistics, there are a number of 
privately compiled yardsticks which serve similar purposes. 

Review, now, the effect that these statistics have. A State gains or 
loses a Congressman; farmers do or do not get loans on their crops; 
an industrial producer does or does not get a higher price for his prod- 
ucts; a tax bill diminishes or increases; wages go up or down. It is no 
exaggeration to talk of hundreds of millions or even billions of dollars 
being involved. Let me be specific: Take the case of wage contracts 
tied to the Consumers’ Price Index. 

In most of these wage contracts, a change of 1 point or 1.14 points 
on the present index, which in 1952 has varied around 190 percent of 
the 1935-39 average, means a change of one cent an hour in wage rates. 
An increase of one cent an hour for a 40-hour week for 50 weeks out of 
the year for 3,500,000 workers known, at a minimum, to be under 
such contracts, means $70,000,000. This does not count millions of 
other employees whose wages or salaries may follow suit along with the
index. Testimony before a Subcommittee of the House Committee on
Education and Labor, which held hearings under the chairmanship of Congressman Steed
on the Consumers’ Price Index in 1951, indicated that one
cent an hour totaled $8,000,000 a year for the General Motors
Corporation alone, and $30,000,000 for the railroads, which have over
1,000,000 employees under such contracts. The tables in these wage
contracts are set up in brackets of 1 or 1.14 points, by tenths of a point
on the index. Thus, at the margins of those wage brackets, if the index
rises or falls by one-tenth, hundreds of millions of dollars are
immediately involved.

1 House Committee on Education and Labor, Subcommittee to Study Consumers’ Price Index, 82d
Congress, 1st Session, Subcommittee Report.
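
The arithmetic behind these figures can be checked with a short Python sketch. The inputs to the first function are the ones quoted above (one cent an hour, a 40-hour week, 50 weeks, 3,500,000 workers). The second function is a simplified assumption about how a bracket table behaves: it treats each full 1.14 points of index movement as one cent, whereas actual contracts key their brackets to stated index levels. Both function names are inventions of this sketch.

```python
from decimal import Decimal


def annual_cost_of_wage_change(cents_per_hour, hours_per_week,
                               weeks_per_year, workers):
    """Annual dollar cost of an hourly wage change across a work force."""
    dollars_per_hour = Decimal(cents_per_hour) / 100
    return dollars_per_hour * hours_per_week * weeks_per_year * workers


def cents_adjustment(index_change_points, points_per_cent):
    """Whole cents of wage change for a given index movement.

    Simplified bracket rule (an assumption of this sketch): every full
    `points_per_cent` of movement in the index is worth one cent an hour.
    """
    change = Decimal(str(index_change_points))
    step = Decimal(str(points_per_cent))
    return int(change // step)


if __name__ == "__main__":
    cost = annual_cost_of_wage_change(1, 40, 50, 3_500_000)
    print(f"one cent an hour for 3,500,000 workers: ${cost:,.0f} a year")

    # A tenth of a point at the margin of a bracket changes the wage rate.
    for movement in ("2.2", "2.3"):
        cents = cents_adjustment(movement, "1.14")
        print(f"index up {movement} points -> {cents} cent(s) an hour")
```

Under the simplified rule, a movement of 2.2 points yields one cent and a movement of 2.3 points yields two, so a single tenth of a point at the edge of a bracket carries the full $70,000,000 annual cost of one cent.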

Uses of statistics such as these imply a faith in statistics of which 
statisticians can be very proud. If statistics and statistical agencies in
the United States were not trusted or had not earned a right to public
confidence, all this would never have happened. We accept the acco- 
lade, but like the Lord Chancellor in Iolanthe we say, a little wryly, 

“But though the compliment implied
Inflates me with legitimate pride,
It nevertheless can't be denied
That it has its inconvenient side.”


Such responsibility as this is very sobering. 

These statistics were usually created, as I have said, for descriptive
and analytical purposes, and machinery for their production was geared
to more or less leisurely uses by people who were assumed to be reasonably
familiar with their construction and their technical limitations.
The checking devices, the revision programs, the publicity plans, were
not laid out with a view to such awesome uses as these, by a wide, often
unknown, and statistically unsophisticated public. The producing 
agencies have had to change their practices; to speed up their calculations;
to guard their results from “leaks”; and greatly to expand their
public consulting machinery. As never before, they are operating in a
goldfish bowl. They are being advised by numerous committees appointed
from the ranks of industry, labor, and technical groups. These
advisory committees convey the statistical needs of the groups from
which they come to the governmental agencies, and often provide technical
advice as well. This type of advisory committee should be
continued and expanded if government statistics are to continue to serve
broad public uses. 

I can say from personal experience that, while these governmental
agencies did not seek the exceptional responsibility which the new
uses of our statistics have introduced, they have taken the responsibility
very seriously. In the opinion of impartial observers, such as the
Task Force headed by Professor Frederick C. Mills for the Hoover
Commission, the important statistical series produced in government
are now administered with competence and honesty and a good deal 
of skill. That is not to say that they could not be improved, for they 
could be. But the statisticians who make these statistics know better 
than anyone that they were never intended to be accurate to the last 
decimal, as many of their users believe them to be. 
By this I do not mean that these statistics are not correctly calculated
arithmetically. As I have just indicated, I believe they are. Arithmetic
is easy to check, and statisticians as a rule do arithmetic well.
Most of these major economic indicators are either double-calculated
by two different sets of clerks, or they have built-in mechanical checks
of various kinds. Of course, there are bound to be occasional errors. No
system is infallible. But those errors are accidental. So don't worry
about that last decimal! The figures will usually be arithmetically correct,
according to the ground rules laid down for the calculation in
question. You cannot “rig” a summary figure with 150,000 prices, or 
millions of people, in it, calculated by a whole battery of individuals, 
even if someone wanted to—and I do not believe they do. 

Rather, I am referring to accuracy in its statistical sense. This depends
upon the ground rules. The question is: Was this statistical
series designed, like the standard meter or the inch, to be accurate to 
a very small fraction of a point? For most of these general-purpose 
national economic indicators, the answer, of course, is “No.” I repeat:
no such accuracy was needed for the purposes for which these statistics
[...] one time as another, and [...]
with equal accuracy, except in most unusual circumstances. It must be
accepted as it stands, and the public must know that.

The next question, then, [...] specific
cases. But take the major ones: Of the indicators I have named,
only the Census, the parity indexes, [...]
indexes were designed for the specific purposes
which they now serve, and none of [...]
nor is that kind of accuracy required [...]

However, in many cases, I believe [...]
“governors” are price indexes, [...]
construction statistics. They have, in effect, [...]
rough justice to the parties concerned, provided they chose the appropriate
statistical series in the first instance. And for many of the
large contracts this has been true. 

Now, given these new, more weighty, and more widespread uses of
statistics, what do we, as statisticians, owe the trusting public which
accepts our products on faith, often with a fateful sentence which
begins, “Statistics prove that . . . ”?

We owe the public, I believe, these things: Arithmetic accuracy; con- 
tinuing observance of the ground rules for compilation, or a warning 
if they are to be changed; fairness and impartiality in handling the 
results; a clear description of the general nature and limitations of
statistics, with a simple measure of their accuracy, if it can be measured,
and a warning if it cannot. We cannot, and we should not, try to ex- 
plain every technical detail, or to make statisticians of everyone, any 
more than all of us should try to understand all the mechanical details 
of how our automobiles work. 

We owe the public also, if we are asked, advice as to whether a cer- 
tain statistical measure is suitable for a given use or not, and an educa- 
tion in the fact that statistics must and will change. They cannot be 
expected to be compiled in exactly the same way indefinitely. They get 
out of repair and go out of fashion, just like other products, if they are 
not revised from time to time, and so responsible statistical agencies do 
revise them. What would you think of a city consumers’ price index
which today had the lamp chimneys and high-button shoes of 1900? 
You may be sure that some of the articles in the indexes of today will 
look just as peculiar in 25 years. Therefore, anyone making a contract 
or setting up any other long-term use for a statistical series should be 
told that he should make provision for periodic shift-overs to revised 
statistics, if he does not wish to upset the contract. In my opinion, it 
would be wise also to provide for periodic review of whether circumstances
have so changed that other measures, or no statistical measures
at all, would be preferable. For example, it should be remembered
by those who employ price indexes in contracts that in periods of price 
stability, which have not been uncommon in our history, such escala- 
tors as these indexes provide may either not be needed, or since they 
are only approximate measures of trends, they may not be accurate to 
a sufficiently fine point so that their changes have any real significance. 

The public should know, too, that there is no real assurance that 
the statistics they use today will be issued at all 10, 15, or 50 years 
from now. Although they are authorized by law, there is nothing ex- 
cept custom and the enthusiasm of their sponsors to keep them going, 
except, as in the case of the Census, or the parity index, where they
have a statutory directive. Other statistics may be discontinued because
they have served their primary purpose (although this is unlikely
to be true of the major economic indexes) or they may be
changed or eliminated because of lack of basic data or of financial or
organization support. In general, in government, as in private agencies, 
budgets are reviewed each year and they are often changed, with con- 
siderable danger to the continuity of statistical service. As the Steed 
Subcommittee of the House Committee on Education and Labor ob- 
served in its report on the Consumers’ Price Index in 1951: 
“The subcommittee believes that the Consumers’ Price Index has become 
so important that it must be regarded as a fixed charge upon the Govern- 
ment; it should not be subject to yearly fluctuations in budget and at the 
same time be required to do the same amount of work. ... Unlike some 
Government programs, the issuance of a statistic of this type depends
almost completely upon continuity of effort. It is impossible to cut the work
one year and increase it the next on the same project. ... ”


Therefore, the users of statistics and the profession must be educated 
to take some responsibility for continuity of statistics where that is 
important—for today it is not assured. 

Finally, statisticians owe the public, I believe, absolute assurance of 
continued competence, of honesty and fairness in the calculation of 
these statistics, and indeed of all major statistics, whether they are
“trigger figures” or not. Note, I say “assurance.” I mean that we, as
statisticians, need not merely to be competent, fair, and honest, but we
need to be able to prove to a statistically unsophisticated public that,
in fact, our statistics are trustworthy. 

This is not merely a public relations problem, although its public 
relations aspects are very important, but one of fundamental, long- 
run institutional changes which will assure that statistics continue to 
serve in the public interest. I believe that the statistical profession and 
its sister professions in the social and the natural sciences, where sta- 
tistics are used, should immediately take the initiative and assume 
responsibility for this public assurance. 

Consider what we do not have in our profession, which other professions
have. There is in our profession, for example, nothing comparable
to the Certified Public Accountant, or “Member of the College of 
Surgeons.” There is no such label as a “Certified Public Statistician.” 
There is no certification of the source of statistics like the stamp “U.S.
inspected: choice, good or commercial,” and few statistical brand
names that mean anything except to the statistically sophisticated.
Moreover, for our products, there cannot be standardization, and there
is only voluntary labeling of the contents, in footnotes which are often
too technical for the public. There is no audit comparable to the audit
of a firm of certified public accountants, invited by a corporation to
audit its books, in order to assure the stockholders and the public not
merely of the accuracy of the arithmetic of its accounting, but also
of the validity of the way in which the books are kept.

* House Committee on Education and Labor, Subcommittee to Study Consumers’ Price Index, 82d
Congress, 1st Session, Subcommittee Report No. 2, 1951, p. 39.

The time has come, in my opinion, for the statistical profession, in 
its own interest, to consider devising some means by which these functions
can be performed for important statistical series. Such a suggestion
implies no doubt of the present validity of existing statistics, nor
should it be considered to reflect in any way upon the professional 
standing or integrity of the statisticians concerned. An accountant 
does not feel incompetent because he is not a CPA, but for certain work 
a CPA is required; a company is proud of a “U.S. inspected” label; an 
accountant does not feel that it reflects on his integrity because an out- 
side accounting firm audits his books. In fact, he welcomes the verifica- 
tion of his procedures and his accuracy. And so should statisticians. 

What are those guarantees? What underlies them? 

The fundamental guarantee of integrity of any statistics, now or in
the future, in public or private agencies, as the Mills Committee* observed,
lies in the quality and the competence of the people who compile
them. While there is no statistical equivalent of admission to the
bar, over the past three decades we all know that there has been a great
improvement in the quality of statistical work; in basic statistical training
in the colleges; and in standards set for statisticians in public and
private agencies. There are now advanced degrees in Statistics, with
fine training in methodology. In the public service, this Association has
helped the Civil Service Commission to define standards for various
grades of work. What is now most lacking in statistical training in colleges,
I believe, is an adequate appreciation of the fact that statistics
are tools to be applied to subject matter. A highly trained mathematical
statistician without knowledge of the subject can be quite as dangerous
as a subject-matter specialist trying to use statistical methods of
which he has little knowledge. Statistical students should therefore be
given a well-rounded training in both areas. Advanced statistical training
should certainly involve practical field work experience wherever
that is at all feasible. In view of the fact that statistics today are big
business, students should be taught how the important, large-scale
statistics are made, and how the individual statistician's role is often
to serve as one cog in a big, intricate, mass-production process. Teachers
will also do well to give emphasis to an uncodified, but none-the-less
real, standard of ethics, which demands of the statistician accuracy,
impartiality, and integrity in all of his work.

* The Statistical Agencies of the Federal Government, A Report to the Commission on Organization
of the Executive Branch of the Government, by Frederick C. Mills and Clarence D. Long, of the
Research Staff of the National Bureau of Economic Research, 1949.

Beyond good training, comes experience in formulating a statistical 
project, defining the terms, managing a study. The crucial nature of 
this area of statistical administration is too little recognized. These new 
uses of statistics make it all important. 

With these considerations in mind, I hope that this Association will 
take positive action to carry out some of the excellent suggestions on 
training made by Samuel Wilks when he was president of this Asso- 
ciation; and at the same time, consider again the possibility of setting 
up examinations for Certified Public Statisticians, remembering that 
that label should only be applied to those who have been proved com- 
petent in method, in subject matter, and in application. 

Finally, I come to the question of reassuring the public that statistics 
which are so widely used in the public interest are competently and 
fairly compiled, and are designed to meet public needs adequately. It 
is time, I believe, for the profession as a whole to share some responsi- 
bility for these statistics with those who make them. 

Since so many of the important statistical governors and statistics 
used for policy-making purposes in the United States originate in Fed- 
eral statistical agencies, we might first consider how this problem 
could be handled for government statistics. (And here I must emphasize 
that I speak as a private individual, and not as a representative of a
government agency.)

І propose that there be created a new United States Statistical Com- 
mission, with résponsibility for audit of statistical Series, similar to an 
accounting audit, empowered to put a "certified" label on a statistical 
product. It Should also be charged with investigation of methods, 
scope, and suitability of statistics, and with making recommendations 
for future improvements and developmental work. Such a Commis- 
sion in some respects would be similar to the “Boards of Visitors” 
which some universities have organized to report to the Board of
Trustees; in others like the Inspector General of the Army; and in
still others like the present Research and Development Board in scientific
matters.

The members of such a Commission might be selected from a panel
of names suggested by this Association and allied associations dealing 
with important subjects to be considered by the Commission. Primar- 
ily, its membership would be drawn from experts outside government 
who have had actual experience in such operating problems as are 
faced in governmental statistics, in addition to knowledge of methodol-
ogy. Like the United States Central Statistical Board, created in 1933,
however, it might also have as members heads of some of the important
Federal statistical agencies, serving either ex officio or on a consultative
basis, where questions of scope and method were involved, but not, of
course, an audit of their statistics. 

It should be a continuing body, serving on occasion as required, but
with a small full-time staff, and adequate financing, so that our most
distinguished statisticians, economists, scientists, and other specialists 
could reasonably be expected to devote time and attention to its work. 
Its reports should be made to the highest executive authority, and be 
generally available to the Congress and to the public. 

In making an audit of an important series, such a Commission should 
rely on sample checks to verify the accuracy of the basic data and 
the calculations, and should carefully examine the established pro- 
cedures to see whether they are being followed scrupulously. It should 
devote a good share of its attention to making suggestions for changes 
in procedures, if those currently in vogue are not in accordance with 
the best methods known at the time of review, and to considering 
whether the data are technically adequate for the uses which are being 
made of them. After such an examination, the Commission should be 
in a position to give these statistics a stamp of approval, “Certified
Public Statistics,” or to withhold that stamp awaiting improvement,
and say why. I, for one, would welcome the constructive suggestions
and the heightened public interest that would come from such a review. 
Private statistical series might also be reviewed by the Commission, 
on request.

To set up such a Commission will involve both institutional inven-
tiveness and new ways and means. I cannot venture to propose a work-
able blueprint in all its details, but I can observe that, for government
statistics, this is certainly an opportune moment for the creation of 
such a Statistical Commission. This idea, of course, is by no means 
original. A Central Commission of Statistics, quite large in size and 
with somewhat different functions than I am suggesting, has operated 
with considerable success in the Netherlands since the 1890’s, and we 
might learn from their experience. 


The general goals I have outlined will not be easy to accomplish:
To lift the level of statistical public relations; to make statistical train-
ing more realistic; to devise means for establishing the title of "certi-
fied public statistician"; to set up an independent Commission to re-
view and pass upon our most important statistics, governmental and
private.

But our profession has made some first steps in these directions. This 
Association has for many years appointed advisory committees to vari- 
ous governmental agencies, at their request—the Bureau of the Census, 
the Bureau of Labor Statistics, the Bureau of Mines, and others. These
committees have served most usefully in dealing both with program 
content and with statistical techniques. Similar committees should con- 
tinue to be appointed in the future. However, they have performed ad- 
visory functions of a somewhat different character than are intended 
for the proposed new Commission. The Committee on Government 
Statistics and Information Services of 1933, out of which came the
U. S. Central Statistical Board and its current successor, the Office of
Statistical Standards, was very valuable. Certainly the level of govern- 
mental statistics has been raised during the past 20 years, in conse- 
quence of its work. The contribution of the report of the Mills Task 
Force to the Hoover Commission, on government statistics, is such an- 
other step. Our Association's Commission on Statistical Standards was 
created with a much more limited scope, but also in the direction of
such functions as are proposed for a Statistical Commission. Its sub-
committees which participated with the Social Science Research Coun-
cil in reviewing the results of the election polls in 1948 and of the sta-
tistics underlying the Kinsey Report were both very worth while. This
Commission has not been very active, for it is voluntary and part-time,
and has no organic relationship to statistical organizations as such. 

These few steps, however, are evidence of professional recognition of 
some of the problems that are before us. Recently, these problems have 
become much more serious, for statistics have suddenly attained stature 
beyond our hopes--and perhaps beyond our desires. Their importance 
goes beyond our present ability to handle the problems which these 
new uses create with the imagination and authority which the situation
requires. As a profession, statisticians must organize to meet this chal-
lenge if statistics are to continue to be administered in the public
interest.


DATA FOR MEASURING THE EFFECTIVENESS OF 
PUBLIC INCOME-MAINTENANCE PROGRAMS* 


JACOB FISHER
Social Security Administration

The effectiveness of the income-maintenance programs is 
best appraised in relation to program objectives. These may 
be considered as falling under two heads—coverage and bene- 
fits. Measures for appraising success in achieving coverage 
and benefit objectives are to be found in the periodic statistical 
reports and special studies of the administering agencies. Some 
of the measures acquire significance only when placed in rela-
tion to general social and economic data. Since agencies vary
in the amount and character of the information they collect
and publish, and gaps exist in general social and economic
data, the resultant body of knowledge is incomplete, particu- 
larly with respect to benefit adequacy. 


THE income-maintenance programs with which the present paper
deals are public programs administered by public authorities, mak-
ing money payments to individuals or families based either on need
or on the occurrence of an event against which the individual and often
his dependents are insured on the basis of past employment or service.
More specifically, the programs are public assistance and the social 
insurance and related programs, i.e., unemployment insurance, dis- 
ability insurance, old-age insurance, survivors insurance, workmen’s 
compensation, and the compensation and pension programs for vet-
erans. Industrial pensions, privately purchased accident and health
policies, annuities and life insurance also help maintain income, but 
are excluded from consideration here because they present research 
problems of a somewhat different kind.

The effectiveness of the income-maintenance programs is most ap-
propriately appraised in relation to adequacy of coverage and ade-
quacy of benefit.

The social insurance programs provide protection for specified 
groups of workers against specified risks. A group is usually identified 
on the basis of industry, occupation or class of worker, or two of these 
categories in combination, and sometimes, in addition, by size of es- 
tablishment, or, as in some workmen's compensation programs, by the 

* Based on a paper presented at the session on “The Application of Statistics in Appraising the
Effectiveness of the Income-Maintenance Programs,” Committee on Statistics in the Social Sciences,
American Statistical Association, Boston, December 28, 1951. Opinions expressed in this paper are those
of the author and do not necessarily reflect the official views of the Social Security Administration.
hazardous nature of the employment. Further, protection under the
social insurance programs is restricted to persons regularly attached to 
the employments covered, reflecting a general policy to exclude per- 
sons engaged only briefly or intermittently in jobs under the system. 

The coverage of the public assistance programs is defined in the 
eligibility conditions. Need, variously interpreted, is common to all 
of them, but they differ otherwise from program to program with re- 
spect to the age classes covered, the nature and severity of the disabili- 
ties recognized as bringing the applicant within the scope of the 
program, and the parental status of the children who could qualify 
for aid. 

The purpose of the benefit in social insurance and related programs 
is generally, but not always, the partial replacement of a wage loss. 
It is usually designed to cover the “basic” requirements of the bene- 
ficiary and, with the exception of veterans’ benefits, is related in amount 
to previous earnings. The relative emphasis given these goals varies. 
The proportion of wage loss compensated tends to be higher in the 
short-run programs, such as unemployment insurance and temporary 
disability, than in the retirement, survivor, and other long-run pro- 
grams; and it is higher in the programs with larger contribution rates, 
and in programs with a more specialized coverage. 

The purpose of the public assistance payment is to meet individual 
need. Wide divergence exists in the interpretation of need. The basic 
procedure in the determination of the assistance amount, not always 
honored in practice, is the comparison of an individual’s requirements 
and resources, the assistance payment representing the amount needed 
to bring total resources up to the requirements recognized in the 
assistance standard. States vary, however, in their assistance stand-
ards; and even within a State, differences may occur in the determina-
tion of need. While the States have made progress toward establishing
a defined content of requirements to which local administering units
must conform in the determination of the assistance amount, the iden-
tification and appraisal of requirements and resources are, to a con-
siderable extent, largely discretionary with the local agency.
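
The comparison of requirements and resources described above can be sketched as a small calculation. This is only an illustration, assuming that no payment is made when recognized resources already meet or exceed requirements (the paragraph implies this but does not state it). The function name and the dollar amounts are hypothetical and do not come from any State's assistance standard.

```python
def assistance_payment(requirements: float, resources: float) -> float:
    """Amount needed to bring total resources up to recognized requirements.

    Assumption for illustration: the payment is never negative, so a
    person whose resources cover requirements receives nothing.
    """
    if requirements < 0 or resources < 0:
        raise ValueError("requirements and resources must be non-negative")
    return max(0.0, requirements - resources)


# Hypothetical monthly figures, in dollars.
payment = assistance_payment(requirements=60.0, resources=25.0)
no_payment = assistance_payment(requirements=40.0, resources=55.0)
```

Because both inputs are set by the agency, two agencies applying the same subtraction to the same person can reach different payments, which is the discretion the paragraph describes.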

What data are available for gauging the extent to which the
income-maintenance programs achieve their goals?
One may turn first to the statistical series maintained by the agen-
cies administering the programs. Special studies constitute another re-
source, as does that branch of inquiry concerned with the analysis of
the legal basis of the program, the rules and regulations issued by
the administering agency, and the decisions of appeals bodies. Not to
be overlooked, either, are the uses to which data on the general char-
acteristics of the population, or of selected groups in it, can be put,
since often what is known about the beneficiaries of a program ac- 
quires significance only when placed in relation to the demographic 
universe to which these individuals belong. Although the inventory 
which follows consists largely of program statistical series, these are
not the only source of information for measures of program effective-
ness.

Since all earners and their dependents are presumably in need of 
income security in old age, and the dependents are all presumably in 
need of survivorship protection, the extent to which the Federal old- 
age and survivors insurance program and the special systems for 
government and railroad workers cover all employments is one meas- 
ure of the coverage of the groups at risk. This may be calculated by 
reference either to aggregate earnings or to persons employed. 

Data on earnings and average number engaged in employments 
under the Federal old-age and survivors insurance program, the rail-
road retirement system, and the Federal, State, and local government
retirement systems are available on an annual and for some programs 
on a quarterly and monthly basis. The relative protection of these 
programs may be measured by the proportion which earnings in covered
employments represent of total earnings in the Department of Com- 
merce national income series; or by the proportion which the number 
in covered employments constitutes of the total number reported in 
the Census Bureau’s monthly report on the labor force. (Note that not 
all State and local government workers are covered by a retirement 
system.)
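
The two coverage measures just described are simple proportions, and a short sketch may make the arithmetic concrete. The function name and all figures below are hypothetical, chosen only for illustration; they are not taken from the Department of Commerce or Census Bureau series.

```python
def coverage_ratio(covered: float, total: float) -> float:
    """Proportion of a total (earnings or employment) found in covered employment."""
    if total <= 0:
        raise ValueError("total must be positive")
    if covered < 0 or covered > total:
        raise ValueError("covered must lie between 0 and total")
    return covered / total


# Hypothetical figures: earnings in billions of dollars, employment in millions.
earnings_coverage = coverage_ratio(covered=80.0, total=100.0)
employment_coverage = coverage_ratio(covered=45.0, total=60.0)
```

The two ratios need not agree, since covered workers may earn more or less on average than workers outside the programs.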

In the Federal old-age and survivors insurance and railroad pro-
grams, data have been developed as well on all persons in the covered
employment during some part of the year. Because of the constant
movement between covered and noncovered employment, more per-
sons are likely to have some earnings credits under a given program
during the year than in any one month. During 1950, for instance, 48
million persons earned wages in employments covered under the old-
age and survivors insurance program, but average monthly covered
employment was 35 million. For some purposes such during-the-year 
employment figures are better than the during-the-week figures, and 
may be used in relation to total during-the-year employment as es- 
timated by the Census Bureau and the Social Security Administration. 
Unlike the case in the average monthly employment estimates, 


1 Sources for statistical series referred to are listed in the Note below.
during-the-year data cannot be added together because some indi-
viduals appear in several series.
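
A brief sketch of the two points made above. The first part uses the 48 million and 35 million figures quoted in the text. The second part uses invented worker identifiers, purely hypothetical, to show why during-the-year counts from separate programs cannot simply be added.

```python
# Figures quoted in the text (millions of persons).
during_year_millions = 48
average_month_millions = 35

# Persons with covered wages at some time in the year per person
# covered in an average month.
turnover_ratio = during_year_millions / average_month_millions

# Hypothetical worker identifiers: worker "w3" held jobs under both
# programs during the year and so appears in both series.
oasi_workers = {"w1", "w2", "w3"}
railroad_workers = {"w3", "w4"}

naive_sum = len(oasi_workers) + len(railroad_workers)   # counts "w3" twice
distinct_workers = len(oasi_workers | railroad_workers)  # counts "w3" once
```

The gap between the naive sum and the number of distinct workers is the duplication that published program totals do not reveal.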

Eligibility for retirement or survivor benefits in the old-age and 
survivors insurance and railroad systems is conditioned upon "insured 
status" which measures attachment to covered employment, or years
of service. A more refined measure of the protection afforded than the
number of persons in covered employment is the number with insured
status or with service credits sufficient to make the worker or his sur-
vivors eligible for benefit in case the event insured against should occur.
Annual data on the number of earners with insured status are avail- 
able for the old-age and survivors insurance and railroad programs, 
as well as data on the number of years with earnings credits or service. 
These data may be related, for analytical purposes, to data on total 
employment, after adjustment for the number of insured persons who 
have left the labor force but still retain insured status. 

The employment and insured status data and the data on years 
with credited wages or services may be had by age, sex, industry (oc- 
cupation in the railroad retirement program), and for some items, by 
color, making possible analysis of the bearing of these factors on the 
extent of protection. The 1 per cent continuous work history sample 
of the Bureau of Old-Age and Survivors Insurance has yielded several 
studies of the effect of worker mobility and changes in economic con- 
ditions on patterns of covered employment for different age, sex, and 
industry groups. 

Measures of the coverage of extended disability programs may be 
found in the payroll and employment series of the railroad retirement 
system and of Federal, State, and local governments, to which refer-
ence has already been made. Some earnings and employment series
are also available to measure the coverage of the temporary disability 
programs in the 4 States with such legislation and in the railroad in-
dustry. Analysis of the extent to which the disability insurance pro-
grams cover all persons presumably in need of such protection may
be made by reference to national earnings and employment data issued
by the National Income Division of the Commerce Department and by the
Census Bureau.


The extent of protection against the risk of unemployment is re-
flected in comparisons which may be made between earners and earn-
ings in covered employment and total wage and salary workers and
their earnings. Data on total payrolls and payrolls subject to unem-
ployment insurance contributions, by State, are available on an annual
and quarterly basis, while data on the number of workers in covered
employment during an average pay period are published on a monthly
and during-the-year basis. An annual series also exists on the number
eligible for benefits on the basis of covered earnings during the year,
i.e., the number with insured status. Similar data are issued for the
railroad unemployment insurance program.

Information on the coverage of workmen’s compensation programs
is distinctly limited. Until the past decade the emphasis in statistics 
in this field was upon accidents and relatively little attention was paid 
to the insurance protection provided. The Social Security Administra- 
tion annually prepares estimates of covered payrolls for the country 
as a whole, but estimates are available for a few individual States only 
and these are in some instances incomplete because limited to par-
ticular types of insurers (i.e., private carriers, self-insurers, State fund).
Variations among the States in the employments covered, and in the 
kinds of insurers permitted, and the elective character of the law in 
some States, make for considerable differences in the information col-
lected and in the comparability, from State to State, of what are pre-
sumably the same items.

Another group of coverage questions relates to the extent to which 
the income-maintenance programs are providing income to persons 
who have experienced an income loss, i.e., the nonworking aged, the 
widows and young children of deceased workers, the nonworking dis- 
abled, and the unemployed. 

Estimates of varying degrees of accuracy and currency are available 
for the size of each of these groups. The number of aged nonearners in 
the country may be obtained from the Census Bureau's monthly report 
on the labor force, while the number and labor force status of widows
and their dependent children may be estimated from that Bureau's
annual estimates of the family composition and marital status of the
population. The number of unemployed is readily obtainable from the
same monthly report on the labor force. Estimates of the number of
nonearner disabled persons in the working years of life, by age and
sex and duration of incapacity, have been developed on the basis of
special studies made in conjunction with the Census Bureau's monthly 
population survey. 

The extent to which the social insurance and related programs pro- 
vide protection against a given income loss may be measured by relat- 
ing the number of beneficiaries in each of these classes to the total 
number experiencing the loss, after adjustments are made for differ-
ences in definition. The appropriate program data are available to a
greater or lesser extent for all programs with the exception of work-
men's compensation (available in some States, however). For some
programs, the data come with age, sex, race, industry or occupation 
detail. In some of the disability insurance programs there is detail 
also on the kind and duration of disability; and in the temporary dis- 
ability program, the number of spells of illness in the benefit year. 
These details are of importance in the analysis of the coverage effec- 
tiveness of the program for particular groups of workers. The relative 
restrictiveness of the disqualification provisions of State unemploy- 
ment insurance laws is reflected in the ratio of disqualifications to the 
estimated number of spells of unemployment. Data on the dependents 
of beneficiaries in relation to the provisions for dependents in the benefit 
schedule contribute to the appraisal of the completeness of protection 
with respect to this aspect of the program. 

The effective coverage of the public assistance program can be
measured only by reference to the total number of needy persons in the 
community. The development of measures of need has, however, long 
defied our ingenuity and for good reason. Need is relative and its 
recognition varies from place to place and over time with living stand- 
ards, fiscal resources and other factors. The Department of Commerce 
issues annual estimates of the per capita income of the individual 
States; these are at best only rough measures of State differences in 
need. Distributions of family income by county will be available 
shortly from the 1950 census; when analyzed by family size they
should provide valuable indicators of the relative number of families, 
county by county and State by State, living below a given income 
standard. Their usefulness in measuring need will be limited however 
by the time lag between the collection and publication of the data, 
differences from place to place in living costs, and by the different
meanings which will be attached in each county to the significance of
the data.

Even if we could agree on a uniform standard for determining need,
it would take nothing short of a house-to-house canvass to ascertain
the number of persons in the community with incomes currently below
the given standard. Public assistance must be applied for: the number 
of needy persons who might qualify but do not apply for public assist- 
ance is not known. What can be measured is the proportion of public 
assistance recipients in the total population, or in such classes as the 
aged. In June 1952, for instance, old-age assistance recipients per
1,000 persons 65 years and over varied from 45 in the District of Co-
lumbia to 631 in Louisiana. Recipient rates indicate the extent to
which population groups are dependent in whole or part upon public
assistance. Whether they also measure the extent to which need is
being met through public assistance will depend, among other things,
on the scope of the public assistance law, its interpretation by the
public assistance agency, and public sentiment. 
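
The recipient rate discussed above is a count of recipients per 1,000 persons in the relevant population class. A minimal sketch follows; the function name and the counts are hypothetical and are not the figures behind the District of Columbia or Louisiana rates quoted in the text.

```python
def recipient_rate(recipients: int, population: int, per: int = 1000) -> float:
    """Recipients per `per` persons in the population base."""
    if population <= 0:
        raise ValueError("population must be positive")
    if recipients < 0:
        raise ValueError("recipients must be non-negative")
    # Multiply before dividing so that round figures stay exact.
    return recipients * per / population


# Hypothetical State: 12,000 old-age assistance recipients among
# 80,000 persons aged 65 and over.
rate = recipient_rate(recipients=12_000, population=80_000)
```

As the text cautions, the rate describes dependence on assistance and does not show how many needy persons failed to apply.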

Monthly data are available on the number of recipients of each 
type of assistance. Recipient rates are published semi-annually; the 
population base in old-age assistance is the number of persons 65 years
of age and over; in aid to dependent children, the number of children 
under 18 years; and in aid to the blind, in aid to the permanently and 
totally disabled and general assistance, the total population. The com- 
parability of State rates in the aid to the blind and disability assistance 
programs is affected by differences of unknown magnitude in the 
relative number of blind and disabled persons, State by State. Special
studies have developed, in addition, information on the personal and 
family characteristics of recipients, the labor force history of old-age 
assistance recipients, the types and duration of disability encountered 
in the aid to the blind and aid to the permanently and totally dis- 
abled programs and the parental status and reasons for dependency of 
children receiving aid to dependent children. The periodic data and 
the special studies throw valuable light on the influence of differences 
in State eligibility standards and in the incidence of social problems 
in the population, as well as the effect of changes made or proposed in 
social insurance and public assistance legislation upon the relative 
number and kinds of persons receiving public assistance in different 
parts of the country. 

The coverage discussed thus far relates to individual programs. 
What is the net coverage of all the programs concerned with a given 
risk group? How many aged persons, for instance, are receiving income 
from Federal old-age and survivors insurance, the railroad and gov- 
ernment retirement programs, the veterans program, or old-age as-
sistance? It is, of course, possible to bring together the relevant figures,
not only for the aged, but for the other risk groups as well, such as the
survivors, the unemployed, and the disabled. The difficulty with these
compilations is general lack of information on the number of persons
receiving income from two or more programs. The size of the groups 
receiving both old-age and survivors insurance and old-age assistance, 
and both old-age and survivors insurance and aid to dependent child- 
ren, has been available on a semi-annual basis since the middle of 
1951, but with this exception, methods have still to be devised for 
eliminating duplication among beneficiary and recipient rolls. In their 
present form the data are useful in indicating the minimum size of the 
group at risk lacking protection. 
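
The reasoning behind that last sentence can be sketched as follows. Adding the program rolls counts a person once for each program that pays him, so the sum is at least as large as the number of distinct persons receiving income. Subtracting the sum from the size of the risk group therefore gives a lower bound on the number lacking protection. The function name and the figures are hypothetical.

```python
from typing import Sequence


def min_unprotected(group_size: int, program_rolls: Sequence[int]) -> int:
    """Lower bound on persons in the risk group receiving no program income.

    The sum of the rolls overstates distinct recipients whenever anyone
    draws from two or more programs, so the true unprotected number is
    at least this large.
    """
    if group_size < 0 or any(n < 0 for n in program_rolls):
        raise ValueError("counts must be non-negative")
    return max(0, group_size - sum(program_rolls))


# Hypothetical risk group of 1,000 persons and three program rolls.
lower_bound = min_unprotected(1000, [400, 300, 100])
```

If duplication among the rolls were known and removed, the estimate of unprotected persons could only rise above this bound.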

Still another approach to net coverage, this time with reference to 
State differences, is the use of per capita expenditures for aggregate
social insurance benefits. State differences in this average presumably
reflect differences in State provision for given risks (unemployment or
temporary disability, for example); however, they also reflect program 
coverage limitations which have a differential impact on States because 
of differences in the industrial composition of the working population 
(old-age and survivors insurance, for example). State differences in 
the coverage of need may be measured, similarly, by reference to State 
per capita expenditures for public assistance. The latter average is
affected, of course, by differences in the extent of need, as well as by 
differences in statutory provisions. 

The raw materials for the appraisal of benefit adequacy in the social 
insurance and related programs include data on aggregate payments, 
the number of beneficiaries, the size distribution of benefit payments, 
the wage loss being compensated, the dependents of beneficiaries, the 
proportion of beneficiaries exhausting benefit rights (in the short-term 
programs), the income of beneficiaries, and living costs. 

Aggregate payments are available for all programs; for the work- 
men’s compensation and State and local government retirement pro- 
grams, the aggregates are estimated. With the exception of workmen's 
compensation, data or estimates are also at hand on the number of 
beneficiaries. This means that for most programs the average benefit 
can be computed. Size distributions, which are a better guide to the 
analysis of benefit adequacy than the average, are published for the 
Federal old-age and survivors insurance system, the Railroad Retire- 
ment Board programs, the Federal Government's retirement system, 
and State unemployment insurance. Size distributions are not avail-
able, at least in published form, for other government retirement pro-
grams, the veterans program, the State temporary disability programs
and workmen's compensation. All the short-run programs collect and
publish data on the proportion of beneficiaries exhausting benefit
rights.

The other types of data mentioned in the first paragraph of this
section (wage loss sustained by beneficiaries, the dependents of
beneficiaries, the income of beneficiaries from benefits and other
sources, and living costs for the most common family groupings en-
countered among beneficiaries, at selected living levels) are not, with
some exceptions, the by-product of program operations and are ob-
tainable only through special study. Especially noteworthy in this
connection are the field surveys conducted by the Social Security
Administration in the past 10 years of the income and living arrange-
ments of old-age and survivors insurance beneficiaries. A number of
States have made similar studies in the field of unemployment insur-
ance. With these exceptions there is a general lack of quantitative
data in the social insurance programs on the relation of the benefit
amount to family requirements and the place of the benefit in the family
income structure.

Certain derived measures have been found useful in benefit ap- 
praisal which are not dependent upon field study. These include 
average benefit as a per cent of the average wage of all workers in 
covered employment (this is not the same thing as the average wage 
of beneficiaries prior to retirement, disability, or unemployment), 
maximum benefit under the law as a per cent of average wage in
covered employment, per cent of benefits which are at the maximum, 
the per cent of beneficiaries receiving public assistance, and changes 
over time in the average benefit as compared with changes over time 
in the Consumer Price Index of the Bureau of Labor Statistics. To a 
greater or lesser extent measures of this kind have been developed in 
the Federal old-age and survivors insurance program and in the State 
unemployment insurance programs. They help answer such questions 
as, How much wage loss is compensated by the benefit? Has the benefit 
formula kept pace with changing wage and price levels? Has the insur- 
ance program reduced the need for public assistance? The answers to 
these questions have an important place in legislative consideration 
of the need for program amendment. 
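
Two of the derived measures listed above lend themselves to a short worked sketch: the average benefit as a per cent of the average covered wage, and the change in the average benefit set against the change in a price index. The function names and all figures are hypothetical; in particular the index values are not Bureau of Labor Statistics data.

```python
def percent_of(part: float, whole: float) -> float:
    """`part` expressed as a per cent of `whole`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def benefit_vs_prices(benefit_start: float, benefit_end: float,
                      index_start: float, index_end: float) -> float:
    """Growth in the average benefit divided by growth in the price index.

    A value of 1.0 means the benefit just kept pace with prices; below
    1.0 means it lost purchasing power over the period.
    """
    if min(benefit_start, index_start) <= 0:
        raise ValueError("starting values must be positive")
    return (benefit_end / benefit_start) / (index_end / index_start)


# Hypothetical monthly figures in dollars, and a hypothetical index.
replacement = percent_of(part=30.0, whole=120.0)
pace = benefit_vs_prices(benefit_start=40.0, benefit_end=50.0,
                         index_start=100.0, index_end=125.0)
```

The first measure compares the benefit with the wage of all covered workers, which, as the text notes, is not the same as the prior wage of the beneficiaries themselves.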

The tools for measuring the adequacy of payments in public assist- 
ance are, with some exceptions, those referred to in connection with 
the social insurance and related programs: data on aggregate payments,
the number of recipients, the size distribution of payments, the income
of public assistance recipients, the size and characteristics of public
assistance families, the trend in average payment in relation to the
trend in the cost of living, and general information on living costs in
relation to family size and composition.

Statistical series on aggregate payments, recipients, average pay-
ment per recipient, and the size distribution of payments are available
for all five types of public assistance on a monthly or less frequent
basis. Average payment and size distribution measure differences 
among the States in the amount of assistance available to recipients. 
They reveal little, however, on the relation of the payment to require- 
ments and on the extent of other income in the family. Special studies 
have developed information along these lines for old-age assistance 
and aid to dependent children recipients in selected States, but because 
resources available to assistance recipients and their treatment by
assistance agencies change with time, the applicability of the findings 
is limited to the period covered by the studies. Similar information is 
lacking for the other programs. 

Major gaps in knowledge concerning the extent to which the income- 
maintenance programs achieve their objectives are summarized below. 

With respect to coverage, there are no reported data and no regu- 
larly published estimates of the number of employees in employments 
covered by State and local government retirement systems or by 
workmen's compensation. Coverage and benefit data in three of the 
four State temporary disability insurance programs are incomplete. 
Current estimates of beneficiaries under workmen's compensation are 
lacking. With some exceptions, information is unavailable on the 
number of beneficiaries and public assistance recipients receiving in-
come from two or more programs.

Benefit data are deficient in a number of respects. There is no in-
formation for the country as a whole on size of benefit in the fields of
workmen’s compensation and State and local government retirement 
systems. Average payments are available for veterans’ benefits but not 
size distributions of benefits. Generally missing in all social insurance 
and related programs with the exception of Federal old-age and sur- 
vivors insurance and some State unemployment insurance programs 
are any efforts at appraisal of the benefit in relation to other sources 
of income, family requirements, and living costs. With these excep- 
tions, again, no materials have been developed to relate the benefit to 
the wage loss, or to estimate the number of beneficiaries receiving 
public assistance supplementation. In publie assistance the chief de- 
ficiencies are in current data on the nonassistance income of recipients 
and organized and analyzed informationoon the budgetary standards 
used by public assistance agencies for estimating requirements. 

With the exception of public assistance, unemployment insurance, 
and Federal old-age and survivors insurance, small-area data on bene- 
ficiaries and their benefits are generally lacking. 

More difficult to classify because not directly related to program 
operations are deficiencies in the documentation of some of the com- 
monest generalizations in the field of social security. We say the aged 
prefer insurance to assistance. How do we know this? Has anyone 
bothered to find out? To what extent does a contributory system of 
social insurance, with benefits paid as a matter of right and based on 
earnings, preserve “self-reliance and initiative . . . dignity and inde- 
pendence,” reward “ambition and effort,” and encourage private 
savings, as claimed by the House Ways and Means Committee in 
1949? What proof do we have that the Bureau of Employment Se- 
curity is correct in stating that unemployment insurance “sustains the 
morale and conserves the skills and standard of living” of the unem- 
ployed? Is there any objective basis for the claim of the House Ways 
and Means Committee that under a contributory social insurance 
system “productivity is encouraged,” the “Nation’s total production 
is increased,” and that such a system helps “to protect the Nation 
from serious economic maladjustment?”* 

To ask these questions is to call attention to the complete absence of 
attitude research in the income-maintenance field. Such research may 
have no direct bearing on day-to-day program operations, but the 
lack of it is painfully apparent when public assistance comes under 
legislative or newspaper attack, or when the alleged merit or lack of 
merit of the reserve fund in old-age and survivors insurance is publicly 
aired. It would be helpful then to know what the general public thinks 
of the issues involved, what the beneficiaries or recipients think, and 
what the views are of the rank and file staff member working in 
the program, newspaper editors, legislators, and others. 

Reference should be made also to the paucity of bench marks by 
which to measure benefit adequacy. The lacunae here are the primary 
responsibility of other kinds of programs. We lack detailed information 
on the relative importance of different sources of income at various 
income levels, at different ages, and among different family types and 
sizes. Research in this area would put in another light the social and 
political issues posed, for instance, by the contradiction between the 
income a good public assistance standard would provide and what the 
family earner in a low-paid job could earn. 


2 House of Representatives, Committee on Ways and Means, Report to Accompany H.R. 6000, Re- 
port No. 1300, 81st Congress, 1st session, August 22, 1949, pp. 2, 3. 

3 Department of Labor, Bureau of Employment Security, Unemployment Insurance: Purposes and 
Principles, Dec. 1950, p. [page number illegible]. 


NOTE 

Published sources for the statistical series referred to in this article 
are listed below. 

Earners, earnings, insurance status, benefit payments, and beneficiaries under 
social insurance and related programs: 

Federal Security Agency, Social Security Administration, Social Security Bulle- 
tin (monthly), including Annual Statistical Supplement, published as part 
of the September issue. 

Federal Security Agency, Social Security Administration, Bureau of Old-Age 
and Survivors Insurance, Quarterly Summary of Wage, Employment and 
Benefit Data (Feb., May, Aug., Nov.). 
— Handbook of OASI Statistics (annually). 

Department of Labor, Bureau of Employment Security, The Labor Market 
and Employment Security (monthly), including monthly Statistical Supple- 
ment and quarterly supplement Employment and Wages of Workers Covered 
by State Unemployment Insurance Laws by Industry and State. 
—Significant Temporary Disability Data (annually). 

Railroad Retirement Board, The Monthly Review. 

—Annual Report. 

U. S. Civil Service Commission, Retirement Report (annually). 

Veterans Administration, Statistical Summary, Claims (monthly). 
—Unemployment Allowance, Claims and Payment Data (monthly). 
—Self Employment Claims and Payment Data, by Agencies (Monthly). 
—Annual Report. 


All State unemployment insurance agencies issue recurrent data periodically, in 
greater or lesser detail. The publications of the California, New Jersey, and 
Rhode Island agencies contain, in addition, data on the temporary disability 
insurance programs in these States. 


Program data of varying degrees of completeness may be found in the reports of 
State workmen’s compensation agencies. Some but by no means all State and 
local government retirement systems include beneficiary data in their annual 
reports. 


Recipients and payments under public assistance programs: 


Federal Security Agency, Social Security Administration, Social Security Bul- 
letin. Recipients and payments appear monthly. Recipient rates are com- 
puted semi-annually and published in the March and October issues. Annual 
per capita expenditures for public assistance are to be found in the March 
issue. 
—Assistance Payments Under State-Federal Programs. Annual release on the 
size distribution of payments for the month of September. 


Periodic data are published by all State public assistance agencies. 

Economic status of selected population groups: 


Federal Security Agency, Social Security Administration, Social Security Bulle- 
tin, June and December. Semi-annual estimates of the number of aged per- 
sons, widows, and paternal orphans with income from employment, social 
insurance and related programs and public assistance. 


General population and economic data useful in developing coverage estimates: 


Department of Commerce, Bureau of the Census, Monthly Report on the Labor 
Force (Current Population Reports, Labor Force, Series P-57). 
—Annual release, usually for the month of April, on marital status, family 
composition, and household characteristics of the population. (Current 
Population Reports, Population Characteristics, Series P-20.) 
— Bureau of Foreign and Domestic Commerce, Office of Business Economics, 
Survey of Current Business (monthly), for estimates of personal income, by 
source, and State per capita income. 

Some special studies referred to in the text: 


Theodore D. Woolsey, “Estimates of Disabling Illness Prevalence in the 
United States.” Public Health Monograph No. 4, Public Health Service, 
1952. 

Edna C. Wentworth, “Resources of Aged Insurance Beneficiaries: 1951 Na- 
tional Survey,” Social Security Bulletin, August 1952. 

Edna C. Wentworth, “Income of Old-Age and Survivors Insurance Bene- 
ficiaries, 1941 and 1949,” Social Security Bulletin, May 1950. 

Lelia M. Easson, “Adequacy of the Income of Beneficiaries Under Old-Age 
and Survivors Insurance,” Social Security Bulletin, Feb. 1948. 

The Role of Unemployment Compensation in Maintaining Family Income and 
Expenditure in an Area of Critical Unemployment. National Opinion Re- 
search Center, Chicago, 1951. 

A Report on Benefit Financing and Solvency of the Employment Security Fund in 
Rhode Island. Rhode Island Department of Employment Security, Provi- 
dence, 1950. 

Aid to the Blind Recipients With Earnings in September 1950, Public Assistance 
Report No. 19, Social Security Administration, 1952. 

Charles E. Hawkins, “Old-Age Assistance Recipients: Reasons for Nonentitle- 
ment to Old-Age and Survivors Insurance Benefits,” Social Security Bulletin, 
July 1952. 

Elizabeth Alling and Agnes Leisy, Aid to Dependent Children in a Postwar Year. 
Social Security Administration, 1950. 

Ruth White and Thomas G. Hutton, Requirements and Incomes of Recipients of 
Old-Age Assistance in 21 States in 1944, Public Assistance Report No. 13, 
Social Security Administration, 1948. 

“Living Arrangements and Physical Condition of Aged Recipients, 21 States,” 
Social Security Bulletin, June 1947. 

Ralph G. Hurlin, Sadie Saffian, Carl E. Rice, Causes of Blindness Among Re- 
cipients of Aid to the Blind. Social Security Administration, 1947. 

Families Receiving Aid to Dependent Children, October 1942, Public Assistance 
Report No. 7, Social Security Administration, 1945. 


AN APPRAISAL OF THE 1950 CENSUS 
INCOME DATA* 


Herman P. MILLER 
Bureau of the Census 


The establishment of a program designed specifically to measure 
the quality of the data obtained in the Census is one of the impor- 
tant differences between the 1950 Censuses of Population, Housing, 
and Agriculture and those previously conducted. Several reports on 
the quality of the data obtained in the Census, which will incorporate 
the findings of the Post Enumeration Survey (a reinterview survey 
conducted by highly trained enumerators after the Census was com- 
pleted) and the record matching studies, are in various stages of com- 
pletion. The purpose of the present article is to provide an interim re- 
port on the reliability of the 1950 Census income data. 

In this paper, comparisons are made between the income size distri- 
butions obtained from preliminary tabulations of the 1950 Census and 
those obtained from the income survey conducted by the Census Bu- 
reau in March 1950 as a supplement to the Current Population Sur- 
vey. In addition, estimates of aggregate income derived from each of 
these surveys are compared with independent estimates prepared by 
the National Income Division of the Department of Commerce.1 The 
comparison of the aggregates emphasizes the well established fact that 
data in all field surveys of income are subject to errors of response and 


* The author is greatly indebted to Edwin D. Goldfield and Gertrude Bancroft of the Bureau of the 
Census who directed most of the work summarized in this paper. Thanks are also due to Leon Paley and 
[name illegible] for their able assistance in all phases of the work. 

1 [...] of the 1950 Census data to be discussed were obtained from a preliminary sample of Census 
returns which will be referred to as the Preliminary Sample Tabulations (PST). These data will be 
compared with information obtained in the Census Bureau's Current Population Survey (CPS) and 
with the income aggregates prepared by the National Income Division of the Department of Com- 
merce (NID). Both the CPS and PST data are based on samples and are therefore subject to sampling 
variation. Figures based on relatively small numbers of cases, as well as small differences between figures, 
should be used with particular care. Estimates of the magnitude of sampling variation for the CPS 
figures may be obtained from the Census Bureau report, Series P-60, No. 7, and corresponding figures 
for the PST data may be found in the report, Series PC-7, No. 2. 

2 Virtually all analyses of income data obtained from household interviews show that this method 
results in understatement of income. The most complete appraisal of the available data on the size 
distribution of income appears in Volume 13 of Studies in Income and Wealth, published by the National 
Bureau of Economic Research. The articles in this volume by Goldsmith; Mandel; and Wasson, Hur- 
witz, and Schweiger are particularly useful in this connection. The Bureau of Labor Statistics report, 
Family Spending and Saving in Wartime, Bulletin No. 822, contains a detailed appraisal of the income 
data obtained in the Survey of Family Spending and Saving in Wartime as well as brief indications of 
the extent of underreporting of income in other field surveys such as the Consumer Purchases Study, 
1935-1936; Minnesota Income Study, 1938-1939; and the 1940 Census. The only detailed appraisal 
[remainder of note illegible]. 

The Census Bureau has noted in almost all of its income reports that 
income information obtained in household interviews is largely “based 
on memory rather than on records, and in the majority of instances 
on the memory or knowledge of some one person, usually the wife of 
the family head. The memory factor in data derived from field sur- 
veys of income probably produces underestimates because the tendency 
is to forget minor or irregular sources of income.”3 At the same time, 
the data suggest that despite these deficiencies, the information ob- 
tained in the field surveys may be sufficiently reliable for most uses. 
In addition to the defects inherent in the household interview ap- 
proach, the 1950 Census had some special problems in regard to the 
collection of income data. For example, the enumerators used in the 
Census were less skillful, less adequately prepared for their jobs, and 
more burdened with procedural details than are the enumerators used 
in the Current Population Survey (CPS). In addition, although the 
“line” schedule used in the 1950 Census was excellent for certain pur- 
poses, it was not well adapted for the collection of family income data.4 
Despite these handicaps, the 1950 Census income data show distribu- 
tions quite similar to those obtained in the CPS, which has produced 
a consistent and useful series of national estimates of the distribution 
of consumer income for each year since 1944. The similarity of the 
CPS and Census income data despite the limitations noted above may 
in part merely reflect the fact that within the framework of the collec- 
tion techniques currently employed by the Census Bureau, the differ- 
ences which will be obtained by alternative forms of questioning will 
not be striking. This was demonstrated in one of the 1950 Census pre- 
tests.5 It is also possible that some of the defects inherent in the 1950 
Census were balanced by better respondent cooperation on the aver- 
age, despite occasional strong individual objection. The legal provisions 
of the Census may have stimulated more accurate reporting on the 
part of respondents. 

3 U. S. Bureau of the Census, Current Population Reports, Consumer Income, Series P-60, No. 9, 
“Income of Families and Persons in the United States: 1950,” March 25, 1952, p. 19. 

4 The family income data in the 1950 Census were obtained by asking questions separately (a) for 
the head of the family, and (b) for all relatives of the head as a group. This method does not probe as 
deeply as the CPS income supplements in which income questions are asked individually for each mem- 
ber of the family. 

5 Several different types of income questions were tested by the Bureau of the Census in the April 
1948 Current Population Survey for the purpose of developing a simplified method of obtaining income 
data, suitable for the 1950 Census. A total of 25,000 households were interviewed by trained enumera- 
tors and the following alternative sets of income questions were asked of systematic subsamples of the 
total sample. 

Type 1. On one-half of the schedules the enumerator was to ask the specific amount of (a) 
wages and salaries; (b) total money income; for each person 14 years and over. 

Type 2. On one-fourth of the schedules the enumerator was to ask the specific amount of 
total money income for each primary family and for each person not a member of a primary 
family. 

Type 3. On one-fourth of the schedules the enumerator was to ask the respondent to indicate 
on a card the broad class interval into which the total money income fell for each primary family 
and for each person not a member of a primary family. 

The findings of the field tests of each of the questions are presented in the table below. One of the 

TABLE 1 

PST AND CPS ESTIMATES OF NUMBER OF INCOME RECIPIENTS 
AND MEDIAN INCOME, BY TYPE OF INCOME, FOR 
THE UNITED STATES: 1949 

                                             PST                  CPS 
Number of income recipients and       (Total population    (Noninstitutional 
median income, by type                 14 years old         population 14 years 
of income                              and over)            old and over, 
                                                            excluding members 
                                                            of armed forces 
                                                            on post) 

Total population                       111,926,000          109,644,000 
Total number of income recipients       72,054,000           71,768,000 
Number of recipients of: 
  Wage or salary income                 54,974,000           54,912,000 
  Self-employment income                10,674,000           10,966,000 
  Income other than earnings            18,813,000           16,875,000 
Median income: 
  Total income                              $1,909               $1,814 
  Wage or salary income                     $2,032               $2,016 
  Self-employment income                    $1,586               $1,039 
  Income other than earnings                $  478               $  496 

® 
INCOME OF PERSONS 

1. Comparison of CPS and PST income data for persons. Essentially 
the same techniques were used to obtain personal income data in the 
March 1950 Current Population Survey and in the 1950 Census. There- 
fore, it is of some interest to compare the results. Table 1 presents a 
summary of the numbers of income recipients and medians for each 
type of income. Although the data require adjustment to make them 
comparable with respect to population coverage (see text below), they 
provide a basis for comparing the PST and CPS income distributions 
at the most crucial points. 

[Footnote 5, continued] striking conclusions derived from these tests was that asking for a global 
figure on total income for the family group (type 2) and asking for information [remainder of 
sentence illegible]. 

MEDIAN TOTAL MONEY INCOME FOR FAMILIES AND INDIVIDUALS BY TYPE 
OF SCHEDULE, FOR THE UNITED STATES, URBAN AND RURAL: 1947 

[Table body not recoverable.] 


With the exception of income other than earnings, PST and CPS ap- 
pear to have reported about the same number of recipients of each type 
of income. The apparent similarity of these estimates is somewhat mis- 
leading inasmuch as the two surveys are not directly comparable with 
respect to population coverage. The PST included about 1.6 million 
institutional inmates,6 and about 0.3 million members of the armed 
forces living on military posts who were excluded from CPS. Since 
these persons are in the population base from which the PST estimates 
of the numbers of income recipients are made, it is apparent that the 
PST estimates are somewhat overstated as compared with CPS.7 If 
the institutional inmates and members of the armed forces living on 
post were excluded from the PST weighting of the data, the total num- 
ber of income recipients in PST would have been estimated to be about 
70.6 million rather than 72.1 million as shown in the table. Experience 
in the reconciliation of other estimates of this type indicates that the 
discrepancy of about 1 million between the CPS and the adjusted 
PST estimate of the number of income recipients is relatively small. 
For example, considerably greater differences in the estimates of the 
number of paid workers during a given year are usually obtained from 
the CPS surveys of work experience conducted in December and the 
income surveys conducted in April. 

It is also apparent from Table 1 that with the exception of income 
from self-employment, the medians obtained for each type of income 
in PST and CPS were about the same. However, a closer look at the 
full distributions by income levels in Table 2 indicates that PST con- 
sistently shows a larger proportion of income recipients in the upper- 
income brackets. The percentage differences do not appear great; how- 
ever, since the upper-income groups possess a large share of the aggre- 
gate income, the small percentage differences shown in the table yield 
relatively large differences in the aggregates. (See page 33 ff. for a de- 
tailed discussion of the aggregate comparisons.) 


6 Income information was very poorly reported for institutional inmates in the 1950 Census. Of the 
1.6 million inmates of institutions 14 years of age and over, an estimated 1.1 million did not report on 
income, 0.3 million reported no income, and 0.2 million reported $1 or more of income. 

7 No adjustment was made for the fact that the PST estimate of the civilian noninstitutional popu- 
lation is 0.4 million greater than the CPS estimate since this represents a difference in the independent 
estimate of the same population rather than a difference in population coverage. The difference between 
these two figures arises from the fact that the population estimate used to inflate the CPS data is a 
projection of 1940 Census data, whereas the PST estimate represents an inflation of a sample of 1950 
Census returns. 


TABLE 2 

PST AND CPS DISTRIBUTIONS OF PERSONS 14 YEARS OF AGE AND 
OVER AND OF AGGREGATE MONEY INCOME, BY TOTAL 
MONEY INCOME, FOR THE UNITED STATES: 1949 

                                          PST                        CPS 
                                 Total                      Noninstitutional 
                                 population                 population 14 
Total money income               14 years old   Aggregate   years old and     Aggregate 
                                 and over       income      over (excluding   income 
                                                            members of armed 
                                                            forces on post) 
                                 (thousands)    (billions)  (thousands)       (billions) 

Total                              111,926        $171.4      109,644           $158.0 
Total with income                   72,054           -         71,768              - 

Per cent of those with income        100.0         100.0        100.0            100.0 
  Loss or $1 to $499                  17.3           1.8         18.9              2.1 
  $500 to $999                        13.5           4.3         13.8              4.1 
  $1,000 to $1,499                    10.9           5.7         10.8              6.1 
  $1,500 to $1,999                    10.2           7.5         10.4              8.3 
  $2,000 to $2,499                    11.2          10.6         11.6             11.9 
  $2,500 to $2,999                     9.1          10.6          9.5             11.9 
  $3,000 to $3,499                     9.0          12.3          8.7             12.8 
  $3,500 to $3,999                     5.6           8.9          5.4              9.2 
  $4,000 to $4,499                     4.0           7.1          3.5              6.8 
  $4,500 to $4,999                     2.2           4.4          2.0              4.8 
  $5,000 to $5,999                     2.9           6.8          2.3              5.7 
  $6,000 to $6,999                     1.3           3.6          1.1              3.2 
  $7,000 to $9,999                     1.4           5.0          1.0              3.9 
  $10,000 and over                     1.4          11.5          1.0              9.1 

Median income                       $1,909           -         $1,814              - 


The discrepancy of approximately $600 between the PST and CPS 
estimates of median self-employment income is rather large. A full ex- 
planation for this difference is not available at present. Perhaps the 
Census obtained more accurate income data than CPS in farm areas 
because of the use of the detailed farm schedule in the Census of Agri- 
culture, which tended to improve the estimate of net income from farm 
self-employment on the population schedule. This improvement would 
have been reflected in less underreporting of income or in the higher 
median income of farm families. (See Table 6.) It is also possible that 
the Census income data benefited from the self-enumeration procedure 
which was supposed to have been used in the Census of Agriculture 
in all farm areas except the South; however, in the absence of any firm 
data on the effects of self-enumeration or the extent to which it was 
actually used in the Census of Agriculture, this point can only be a 
matter of conjecture. The only available evidence on this point comes 
from a pretest of the Census conducted in October 1948 in Union 
County, Indiana, and Carroll County, Kentucky, to determine the 
relative merits of self-enumeration and direct-enumeration techniques. 
This pretest showed that in about 60 per cent of the cases where the 
self-enumeration procedure should have been used, the Census enu- 
merators rather than the respondents actually completed the schedules. 
Even where the respondents entered the income information on the 
self-enumeration schedules, a large proportion of the entries were ap- 
parently rounded to hundreds of dollars indicating that the respondents 
probably did not take advantage of the advance distribution of sched- 
ules to refer to records. Moreover, there were no significant differences 
in the medians produced by each procedure. 

2. Comparison of CPS, PST, and NID aggregates. Although the pri- 
mary purpose of the income questions in the 1950 Census and in the 
March 1950 Current Population Survey was to provide a distribution 
of the population by income levels, estimates of the aggregates of each 
type of income can be derived from these data. The comparison of 
these estimates with those prepared by the National Income Division 
of the Department of Commerce provides further information regard- 
ing the reliability of the 1950 Census income data. Such a comparison 
is made in Table 3. The Census aggregates shown in this table largely 
represent estimates derived from the tabulated data and are subject 
to substantial errors of estimation. Moreover, despite the fact that the 
NID aggregates are largely based on record data and may therefore 
be considered more reliable than the field survey data, they are also 
subject to errors of estimation. For these reasons, small differences in 
the results must be interpreted with care. 

The PST and CPS estimates of aggregate income shown in the table 
below were computed by multiplying the estimated number of per- 
sons in each income interval by the average income for that interval. 
For intervals between $500 and $10,000, the midpoint of each interval 
was assumed to be the average; $250 was selected as the average for 
the “Loss or $1 to $499” level, and $20,000 was selected as the average 
for the $10,000 and over level. The aggregates were obtained by mul- 
tiplying the frequencies by the average income for each interval. The 
NID aggregates with which the PST and CPS aggregates are compared 
were obtained by adjusting the Personal Income Series of the Depart- 
ment of Commerce to make it more comparable with the definition of 
income and the population coverage of the Census income data.8 
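
The interval-average procedure just described can be sketched in a few lines. The following is only an illustration, not the Bureau's actual tabulation routine: the percentages are those of Table 2 as far as they can be read (a few digits are reconstructed, which is an assumption), the recipient counts are taken from Table 1, and the names `INTERVAL_AVERAGES`, `PST_PERCENT`, `CPS_PERCENT`, and `aggregate_income` are hypothetical.

```python
# Assumed average income per interval: $250 for "Loss or $1 to $499",
# interval midpoints from $500 up to $10,000, and $20,000 for the open-ended top class.
INTERVAL_AVERAGES = [250, 750, 1250, 1750, 2250, 2750, 3250,
                     3750, 4250, 4750, 5500, 6500, 8500, 20000]

# Per cent of income recipients in each interval (as read from Table 2;
# some digits are reconstructed and should be treated as assumptions).
PST_PERCENT = [17.3, 13.5, 10.9, 10.2, 11.2, 9.1, 9.0,
               5.6, 4.0, 2.2, 2.9, 1.3, 1.4, 1.4]
CPS_PERCENT = [18.9, 13.8, 10.8, 10.4, 11.6, 9.5, 8.7,
               5.4, 3.5, 2.0, 2.3, 1.1, 1.0, 1.0]

# Number of income recipients (Table 1).
PST_RECIPIENTS = 72_054_000
CPS_RECIPIENTS = 71_768_000


def aggregate_income(recipients, percents, averages):
    """Sum over intervals of (persons in interval) x (assumed average income)."""
    return sum(recipients * pct / 100.0 * avg
               for pct, avg in zip(percents, averages))


pst_billions = aggregate_income(PST_RECIPIENTS, PST_PERCENT, INTERVAL_AVERAGES) / 1e9
cps_billions = aggregate_income(CPS_RECIPIENTS, CPS_PERCENT, INTERVAL_AVERAGES) / 1e9

print(f"PST aggregate: ${pst_billions:.1f} billion")
print(f"CPS aggregate: ${cps_billions:.1f} billion")
```

Run as written, the sketch gives roughly $171.6 billion for PST and $158.0 billion for CPS, close to the published $171.4 and $158.0 billion. A small gap of this kind is to be expected when the calculation starts from percentages rounded to one decimal place rather than from the underlying frequencies.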


TABLE 3 

PST, CPS, AND NID ESTIMATES OF AGGREGATE INCOME, 
BY TYPE OF INCOME, FOR THE UNITED STATES: 1949 

(Billions) 

                                   PST             CPS                  NID 
Type of income                (Total           (Noninstitutional    (Total 
                               population       population 14        population 
                               14 years old     years old and        14 years old 
                               and over)        over, excluding      and over) 
                                                members of the 
                                                armed forces 
                                                on post) 

Total                           $171.4*          $158.0*              $187.6 
Wage or salary income            124.3            120.0                127.5 
Self-employment income            31.1             26.5†                29.3‡ 
Income other than earnings        16.6             18.8                 30.8 

* The detail by type of income does not add to the total because each of the aggregates was esti- 
mated independently. 

† Nonfarm self-employment estimate: $19.1 billion; farm self-employment estimate: $7.4 billion. 

‡ Nonfarm self-employment estimate: $19.4 billion; [remainder of note illegible]. 

Perhaps the most striking fact shown in the above table is that the 
PST obtained about 90 per cent of the comparable NID estimate of 
aggregate income. This proportion is considerably greater than that 
obtained in any of the CPS income surveys which have been conducted 
to date. The March 1950 CPS obtained only 84 per cent of the NID 
aggregate. 
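
As a rough check on the proportions just quoted, the totals in Table 3 can be expressed as percentages of the NID benchmark. This is a minimal sketch using only the three published totals; the names `NID_TOTAL`, `SURVEY_TOTALS`, and `coverage_percent` are hypothetical.

```python
# Aggregate income totals from Table 3, in billions of dollars.
NID_TOTAL = 187.6
SURVEY_TOTALS = {"PST": 171.4, "CPS": 158.0}


def coverage_percent(survey_total, benchmark_total):
    """Survey aggregate expressed as a percentage of the benchmark aggregate."""
    return 100.0 * survey_total / benchmark_total


coverage = {name: coverage_percent(total, NID_TOTAL)
            for name, total in SURVEY_TOTALS.items()}

for name, pct in coverage.items():
    print(f"{name}: {pct:.1f} per cent of the NID aggregate")
```

The PST ratio works out to a little over 91 per cent, which the text rounds to "about 90 per cent," and the CPS ratio to about 84 per cent.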


8 [...] paper by S. F. Goldsmith, “Appraisal of Basic Data Avail- 
able for Constructing Income Size Distributions,” in Studies in Income and Wealth, 
Volume 13 (National Bureau of Economic Research). 

For the reasons noted in a preceding section, the PST and CPS esti- 
mates of aggregate income shown in the table are not directly com- 
parable. If institutional inmates are excluded from the PST tabula- 
tions, the estimated aggregate income is reduced from $171.4 billion 
to $168.0 billion. An additional adjustment to exclude members of the 
armed forces living on post from the PST data reduces the aggregate 
about $1 billion more to about $167.0 billion. Thus, the actual differ- 
ence between the PST and CPS aggregates when adjusted for popula- 
tion coverage is about $9 billion. 


TABLE 4 

PERSONS 14 YEARS OF AGE AND OVER BY TOTAL MONEY 
INCOME IN CPS AND IN THE CENSUS, FOR THE 
UNITED STATES 

                                                          Paid workers 
Comparison of CPS and Census           All                 Wage or     Self- 
                                       persons    Total    salary      employed 
                                                           workers     workers 

Total in sample                         5,701     3,118     2,529        589 
Total reporting on income in CPS 
  and Census                            4,898     2,607     2,165        442 
Per cent in same income interval 
  in both surveys*                         61        49        52         31 
Per cent in higher income inter- 
  val in CPS                               20        27        25         38 
Per cent in higher income inter- 
  val in Census                            18        25        23         31 

Median income in CPS†                  $1,914    $2,326    $2,389     $1,797 
Median income in Census†               $1,849    $2,238    $2,319     $1,543 

* The income intervals used were those shown in Table 2 with the exception that $6,000 to $9,999 
was treated as a single interval. 

† These figures differ from those shown in Table 2 because of sampling variation. 


3. Income data from CPS-Census Match. About one-fifth of the per- 
sons who reported on income in the March 1950 CPS were also required 
to furnish income data in the 1950 Census. The differences between the 
two sets of reports show the extent of variability of response in these 
surveys. They also indicate some of the concomitant effects of this 
variability on the income distributions. The data in Table 4 are based 
on an analysis of the income reported by 5,700 persons 14 years of age 
and over who were interviewed in both CPS and in the 1950 Census. 
Of the 4,898 persons who reported on income in both surveys, 61 
per cent were in the same income interval in both surveys, 20 per cent 
were in a higher income interval in CPS, and 18 per cent were in a 
higher income interval in the Census. (About 82 per cent of the per- 
sons reporting on income in both surveys were either in the same in- 
terval or in one higher or lower adjacent interval.) Despite the fact 
that nearly two-fifths of the respondents were reported at different 
income levels in each survey, there was no significant difference be- 
tween the medians or between the distributions by income levels. The 
similarity in the distributions largely reflects the fact that income was 
relatively overstated in each survey about as frequently as it was un- 
derstated. These reporting and enumerative errors generally tended 
to cancel, leaving the income distributions relatively unchanged. De- 
spite the similarity of the over-all distributions, there may be impor- 
tant differences between the CPS and Census income distributions for 
certain segments of the population. The relatively small size of the 
CPS-Census Match sample precludes any detailed examination of the 
income effects associated with variability of response at the present 
time. A detailed study of this characteristic will be included in the 
analysis of the results of the Post Enumeration Survey. However, the 
analysis of income differences by class of worker (Table 4) and by age, 
sex, and veteran status (Table 5) does not indicate that variability 
of response resulted in any significant differences for these character- 
istics. 

Over four-fifths of the persons reporting no income in the March 
1950 CPS also reported no income for 1949 in the Census, but only 
about one-half of the persons reporting some income in CPS were in 
the same income interval in the Census. Some indication of the extent 
of variation in response among income recipients may be obtained 
from the figures for paid workers in Table 4. About one-half of the wage 
or salary workers were in different income intervals in both surveys; 
however, the medians for these workers do not differ significantly. As 
might be expected, there was considerably greater variation of re- 
sponse for self-employed workers than there was for wage or salary 
workers. About 70 per cent of these workers reported different incomes 
in the two surveys. Even here, however, incomes were underreported 
almost as frequently as they were overreported. 


Table 5 indicates that about three-fourths of the women as compared
with only one-half of the men reported the same income in both
surveys. However, nearly half of the women in the sample but only 6
per cent of the men reported no income. Among income recipients,
TABLE 5

CONSISTENCY OF INCOME REPORTING IN CPS AND IN THE
CENSUS FOR PERSONS 14 YEARS OF AGE AND OVER BY
SEX, AGE, AND VETERAN STATUS

                                     Per cent in same income
                                     interval in CPS and the
                                             Census             Median income
                                     -----------------------   ----------------
                                     Persons      Income
Sex, age, and veteran status         reporting    recipients     CPS    Census
                                     on income    in CPS and
                                     in CPS and   Census
                                     Census

Total...............................    61           44        $1,014   $1,849
Male................................    49           44         2,432    2,878
  14 to 24, Total...................    56           43         1,054    1,289
  14 to 24, Veteran of World War II.    48           48         2,185    2,231
  14 to 24, Nonveteran of World
    War II..........................    58           41           722      897
  25 to 44, Total...................    43           43         2,904    2,850
  25 to 44, Veteran of World War II.    43           43         2,955    2,918
  25 to 44, Nonveteran of World
    War II..........................    43           43         2,860    2,778
  45 to 64..........................    46           46         2,644    2,646
  65 and over.......................    48           44         1,037    1,234
Female..............................    73           43           910      941
  14 to 24..........................    12           47           838      992
  25 to 44..........................    73           43         1,271    1,298
  45 to 64..........................    72           41           888      915
  65 and over.......................    69           45           575      497


about the same proportion (one-half) were in the same income level 
for both sex groups. The medians and distributions by income level in 
CPS and the Census differed by negligible amounts for both males and 
females. 

Within each sex group, the proportion of income recipients who were 
in the same income level in both surveys did not vary by age or veteran 
status. The differences by age and veteran status in the median in- 
comes derived from the two surveys were also insignificant. 


INCOME OF FAMILIES 
1. Comparison of CPS and PST income distributions. The family income
data obtained in the March 1950 CPS represent about the best
estimate of the distribution of families by income level which can be
made within the limitations of the collection techniques thus far used
by the Census Bureau. For this reason, despite their shortcomings,
these data provide a standard with which the 1950 Census income data
can be compared.

Table 6 shows that there is no significant difference between the 
PST and CPS median income for families within each color and residence
group.¹⁰ The more detailed distributions by income level and
color in Table 7 indicate a similar correspondence between the results 
obtained in these surveys. The one marked difference between the two 
distributions is the larger proportion of “zero-income” families obtained 
in the PST results. This deficiency in the PST data resulted largely 
from the enumerative and editing procedures used in the 1950 Census 
and is discussed in some detail below. 


TABLE 6

PST AND CPS ESTIMATES OF MEDIAN TOTAL MONEY INCOME OF
FAMILIES, FOR THE UNITED STATES, URBAN AND RURAL

                       All classes          White            Nonwhite
Area                 ---------------   ---------------   ---------------
                       PST      CPS      PST      CPS      PST      CPS

United States......  $3,068   $3,107   $3,216   $3,232   $1,425   $1,650
Urban..............   3,420    3,486    3,581    3,619    1,850    2,084
Rural nonfarm......   2,552    2,763    2,711    2,851    1,141    1,240
Rural farm.........   1,734    1,587    1,936    1,757      733      691


Although the medians obtained in CPS and PST are substantially
the same, patterns of small differences which appear in the data may
be of some significance. Among both white and nonwhite families the
CPS medians in urban and rural-nonfarm areas are slightly (though
not significantly) higher than the corresponding PST results; however,
in rural-farm areas PST showed somewhat higher medians than CPS.
Although some of the differences between CPS and PST are attributable
to the fact that college students living away from home were included
as family members in CPS and as unrelated individuals in the Census,
it is likely that the higher CPS results in these areas reflect the
superior schedule design and better enumeration. The reasons for the
higher PST estimates of farm income have already been discussed (see
page 32).

Table 7 indicates that there were perhaps four times as many families
without income in PST as in CPS. In absolute terms these families
numbered 1.6 million in PST as compared with 0.4 million in CPS. The
relatively large number of families without income in PST is partly
accounted for by the editing procedures used in the 1950 Census. About
one-third, or 0.5 million families without income in PST, were cases in
which the family head had no income and income information for other
relatives was not obtained. Although many of these families probably
represented cases in which the head had no income but other family
members did, they were all tabulated as having no income according
to the income editing rules used in the 1950 Census (see discussion of


TABLE 7

PST AND CPS DISTRIBUTIONS OF FAMILIES BY TOTAL MONEY
INCOME, BY COLOR, FOR THE UNITED STATES: 1949

                             Total              White             Nonwhite
Total money income       ---------------   -----------------   ---------------
                           PST      CPS       PST       CPS      PST      CPS

Number (thousands).....  38,788   39,193    35,411    35,988    3,377    3,205

Total..................   100.0    100.0     100.0     100.0    100.0    100.0
Per cent reporting.....    93.9      *        93.8       *       95.9      *
Per cent not reporting.     6.1      *         6.2       *        4.1      *

Total reporting........   100.0    100.0     100.0     100.0    100.0    100.0
None...................     4.1      0.9       3.9       0.9      6.3      0.9
Loss, or $1 to $499....     4.5      5.0       8.7       4.2     11.0     14.0
$500 to $999...........     6.8      6.2       5.7       5.8     18.4     16.0
$1,000 to $1,499.......     7.3      7.3       6.5       6.6     15.8     15.1
$1,500 to $1,999.......     7.5      7.6       7.0       7.1     12.6     18.5
$2,000 to $2,499.......     9.3     10.2       9.0      19.0     11.9     12.9
$2,500 to $2,999.......     9.0     10.4       9.2      10.5      7.0      9.2
$3,000 to $3,499.......    10.9     11.2      11.4      11.7      6.2      5.6
$3,500 to $3,999.......     8.7      8.8       9.8       9.5      8.0      8.7
$4,000 to $4,499.......     7.2      6.7    [illeg.]     7.1      2.1      2.6
$4,500 to $4,999.......     4.9      5.3       5.8       5.6      1.0      1.9
$5,000 to $5,999.......     7.9      7.8       8.4       8.8      1.8      2.5
$6,000 to $6,999.......     4.8      4.8       4.6       5.1      1.0      1.3
$7,000 to $9,999.......     4.7      5.0       5.1       5.4      0.8      0.5
$10,000 and over.......     2.9      2.6       3.2       2.8      0.2      0.3
Median income..........  $3,068   $3,107    $3,216    $3,232   $1,425   $1,650

* Comparable figures not available.
effects of Census editing procedures below). At present little is known
about the remaining 1.1 million “zero-income” families. Nearly half of
these families probably had no income as defined in the Census, if the
CPS figure of 0.4 million is taken to be accurate. They may have been
living on savings, charity, or gifts, or else they were newly created
families or families in which the sole breadwinner had only recently
died or left the family. The remainder of these families appear at the
“zero-income” level for a variety of reasons, such as errors in
enumeration and coding.

2. Comparison of PST aggregates obtained from the tabulations for
families and unrelated individuals and for persons. The PST tabulations
for families and unrelated individuals yielded a considerably lower
aggregate ($154.5 billion)¹¹ than the comparable tabulations for persons
14 years old and over ($168.0 billion). Theoretically, the two estimates
should have been about the same. The lower estimates derived from


FOR PERSONS 14 YEARS OF AGE AND OVER

Income received by this person in 1949:

  31a. Last year (1949), how much money did he earn working as an
       employee for wages or salary? (Enter amount before deductions
       for taxes, etc.)
  31b. Last year, how much money did he earn working in his own
       business, professional practice, or farm? (Enter net income)
  31c. Last year, how much money did he receive from interest,
       dividends, veteran's allowances, pensions, rents, or other
       income (aside from earnings)?

If this person is a family head (see definition below), income received
by his relatives in this household:

  32a. Last year (1949), how much money did his relatives in this
       household earn working for wages or salary? (Amount before
       deductions for taxes, etc.)
  32b. Last year, how [...]
  32c. Last year, how much money did his relatives in this household
       receive from interest, dividends, veteran's allowances,
       pensions, rents, or other income (aside from earnings)?

(Entry spaces for each item: dollar amount or “[ ] None”; the schedule
also carries a space marked LEAVE BLANK.)
the tabulations for families are largely attributable to the method used
to collect family income data in the Census.

Income information in the 1950 Census was obtained from a 20-per
cent sample of the population. If a person came into the sample, he
was asked questions 31a-c indicated in the excerpt from the 1950 Census
schedule shown above. If the sample person was not a family head,
questions 32a-c were skipped; however, if he was a family head,
questions 32a-c were asked in order to obtain income information for
the entire family group. Although this procedure provided an unbiased
sample of families and of persons, it increased the possibility of
introducing reporting errors.


TABLE 8

PER CENT OF FAMILIES AT EACH INCOME LEVEL UNDER $10,000
WITH FAMILY INCOME GREATER THAN HEAD’S INCOME:
PST AND CPS*

Total money income                  PST       CPS

Under $1,000....................    19.3      27.1
$1,000 to $1,999................    30.7      40.6
$2,000 to $2,999................    29.6      38.0
$3,000 to $3,999................    33.9      41.8
$4,000 to $4,999................    49.1      56.0
$5,000 to $5,999................    60.0      70.1
$6,000 to $6,999................    64.5      72.4
$7,000 to $9,999................    65.6      74.0

* PST data showing the number of families with more than one income
recipient for the $10,000 and over interval not available.

As the collection procedure implies, the family income data are based
on replies to income questions asked separately for the head of the
family and for all relatives of the family head as a group, whereas the
income data for persons are based on replies to income questions asked
for each person in the sample. The failure to obtain income information
individually for relatives of family heads probably resulted in
underreporting of income for this group. Evidence supporting this view
is shown in Table 8, where it may be noted that at each income level the
CPS data have a larger proportion of families in which the family
income was greater than the head's income. This is equivalent to saying
that there were more families having income recipients other than the
head in CPS than in PST.
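
The comparison drawn from Table 8 can be checked mechanically. The short sketch below copies the Table 8 percentages into a dictionary (the variable names are illustrative) and confirms that the CPS figure exceeds the PST figure at every income level.

```python
# Per cent of families with family income greater than the head's income,
# by total money income level, as given in Table 8: (PST, CPS).
table8 = {
    "Under $1,000": (19.3, 27.1),
    "$1,000 to $1,999": (30.7, 40.6),
    "$2,000 to $2,999": (29.6, 38.0),
    "$3,000 to $3,999": (33.9, 41.8),
    "$4,000 to $4,999": (49.1, 56.0),
    "$5,000 to $5,999": (60.0, 70.1),
    "$6,000 to $6,999": (64.5, 72.4),
    "$7,000 to $9,999": (65.6, 74.0),
}

# Excess of the CPS percentage over the PST percentage at each level.
differences = {
    level: round(cps - pst, 1) for level, (pst, cps) in table8.items()
}

# True only if CPS is higher than PST at every income level.
cps_higher_everywhere = all(diff > 0 for diff in differences.values())
```

The excess ranges from about 7 to about 10 percentage points, which is consistent with the view that income of relatives other than the head was underreported in PST.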
3. Effects of Census editing procedures. In the interest of economy, a
relatively simple editing procedure was used in processing the 1950
Census family income data. It was assumed that if the income
information was obtained for the family head, the total family income
was known even when the income information was not obtained for other
family members. This procedure is different from the CPS procedure
where any incompleteness in the returns for a family would result in
a not reported (NA) classification for the family. It was adopted
because [...]


TABLE 9

PER CENT DISTRIBUTION OF FAMILIES BY TOTAL MONEY
INCOME, USING DIFFERENT NOT REPORTING
(NA) CRITERIA

[Columns: total money income; distribution (remainder of this column
heading illegible in the source); distribution if families in which
income information was not obtained for relatives of the head were
tabulated as NA. The table entries are not legible in the source.]

© 
[...] head is generally the principal recipient. Although other editing
procedures were considered, only the one finally adopted fitted in as
part of the general coding and editing scheme used in the Census.

In general, the editing scheme used in the 1950 Census produced the
desired results of keeping the NA rate low without seriously distorting
the income distribution (see Table 9). The NA rate obtained using the
1950 Census editing procedures was half of what it would have been if
the alternative procedure had been used of classifying as NA families
in which no report was made for family members other than the head.
The median income based on the procedure which was used was about
$70 lower than the median which would have been obtained from the
alternative procedure. Nevertheless, this procedure created a downward
bias in the statistics inasmuch as all the missing entries were
converted to zeros, although some of them must have actually
represented amounts. The approximate magnitude of this bias is reflected
in the fact that the aggregate computed from the distribution of
families and individuals by income levels prior to editing was $4
billion higher than the aggregate derived from the data as tabulated
(after editing). This suggests that editing accounts for about
one-fourth of the difference between the family aggregate and the
aggregate for persons.
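
The arithmetic behind this statement can be laid out in a few lines. The sketch below uses only the aggregates quoted in the text ($168.0 billion for persons, $154.5 billion for families and unrelated individuals, and the $4 billion editing effect); the variable names are illustrative.

```python
# Aggregates quoted in the text, in billions of dollars.
persons_aggregate = 168.0    # tabulations for persons 14 years old and over
families_aggregate = 154.5   # tabulations for families and unrelated individuals
editing_effect = 4.0         # pre-editing aggregate minus tabulated aggregate

# Difference between the two PST aggregates.
gap = persons_aggregate - families_aggregate

# Portion of that difference attributable to editing.
editing_share = editing_effect / gap
```

The gap is $13.5 billion and the ratio comes to a little under three-tenths, of the same order as the "about one-fourth" cited above.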


SUMMARY


The 1950 Census data which are currently becoming available provide,
for the first time, statistics on the size distribution of income for
each locality in the United States, as well as analytical tabulations for
the country as a whole which could not be obtained from the sample
surveys conducted in earlier years. Experienced users of Census data
have learned to inquire very carefully into the quality of information
obtained by a mass enumeration of the population. For this reason, an
investigation of the quality of these data is appropriate at the present
time. Unfortunately, a full report on the quality of the income data
will not be available until the results of the Post Enumeration Survey
and the various record matching projects are completed. However,
the preliminary investigation described above suggests that the 1950
Census income data should be satisfactory for most purposes. The two
major conclusions of the investigation are: (1) the family income data
obtained in the Census are of comparable level with those obtained in
the annual income supplements to the Current Population Survey; and
(2) the income data for persons obtained in the Census are somewhat
more reliable than the family income data, largely because the latter
were obtained by an inferior collection technique necessitated by the
use of a line schedule in the 1950 Census.


DELIMITATION OF ECONOMIC AREAS: STATISTICAL 
CONCEPTIONS IN THE STUDY OF THE SPATIAL 
STRUCTURE OF AN ECONOMIC SYSTEM* 


RUTLEDGE VINING
University of Virginia 
A RECENT monograph by D. J. Bogue describes the procedures used
by the Bureau of the Census in grouping the counties of the
United States into a new set of 501 areas called State Economic Areas.
Along with the description of procedure, there is a large map showing
the Areas, a table listing the counties in each Area, a table giving for
each Area some 88 pieces of numerical data regarded as indicating the
non-agricultural characteristics of the Area, and a similar table giving
for each Area about 75 pieces of numerical data presented as indicating
the agricultural characteristics of the Area.
The present discussion will concentrate upon the nature of these 
Areas that are designated as “functional groupings of counties” each 
containing a “distinctive economy.” Interpreted simply as a new set 
of areas for Census reporting intermediate in size between the county
and the State, the procedure described makes quite good sense. But
there are overtones in the discussion that will suggest to some readers
that more was intended, that the work was carried on to such effect
that the component “economies” or “regional economic units” of this
nation have been discovered and approximately designated. It is this
that many readers will look for in such a work, and I believe this
seeking for a “natural” area unit for economic studies is based upon a
fundamental misunderstanding.
In my opinion, the spatial structure of a human economy should be
regarded conceptually as virtually a continuum. As in other studies of
phenomena having volume or spatial extension, empirical observations
must be made upon the contents of finite and arbitrary spatial units,
these empirical observations being viewed as providing an approximate
conception of what would be viewed were the spatial units made
arbitrarily smaller while the contents were being made more dense.
From this point of view, I believe the student of the structure,
functioning, and development of an economic system may regard as not
particularly relevant the criteria used by the Bureau of the Census in
* [...] assistance and cooperation of the Bureau of Population and
Economic Research, University of Virginia.
[...] the counties of the United States, Bureau of the Census, W[...]
the designation of the Areas. While census data classified for 501
areas provide more information than if classified for only 48 States,
less information is provided in this way than if they were classified by
the 3070 counties. I should be inclined to interpret the choice of the
number 501 as an economic decision in the sense that presumably the
cost of making this number 502 was regarded by some responsible
person or group as being greater than the value of the information
that would be added; and the particular boundaries selected I believe
should and can be defended on grounds other than those that may be
found in the needs of political economists studying how a human
economy works and develops as an organization.

I shall attempt to develop these points, first outlining the nature of 
the purposes that seem to me to give meaning to the criteria described 
as having been used in the designation of the State Economic Areas. 
In the monograph, these purposes are confused, I think, with others 
for which the criteria are not relevant. Accordingly, I shall discuss these 
other purposes in outlining a conception of regional or spatial structure 
in the specification of which such areas as are described in this mono- 
graph would play no essential role. 


I 


Those who were responsible for delimiting the Areas were confronted 
with the problem of selecting a working rule which, together with sta- 
tistical observations, would determine the boundaries of sub-areas into 
which the entire territory of the nation is to be partitioned. An in- 
definitely large number of rules would of course be possible alterna- 
tives to the one finally selected. The selection of а particular rule would 
be made on the basis of how that rule performs in designating the 
county groupings, this performance to be measured in terms of the 
usefulness of the county groupings so designated. The technical
specialists, who did the staff work and who professionally advised those
making the final choice, had the problem of analyzing the respective
performance characteristics of the many possible rules. The
decision-makers presumably chose that rule which they regarded as
having the most desirable performance properties.

The question that I raise now is this: In choosing a “best” rule for 
delimiting the areas, what purposes did the decision-makers have in
mind for these particular area designations to serve? The rule selected 
would possess optimal properties in its capacity for least expensively 
generating a set of areas that attains predetermined objectives. What 
were these objectives? 
The primary responsibility of the Bureau of the Census is that of
providing enumerated information. It gauges the current demands for
information, anticipates future demands, recommends to a “higher”
authority budget expenditures upon information-gathering, seeks out
economical means for obtaining specified information, helps this
“higher” authority compare the costs of obtaining an increment to
the total information gathered against an evaluation of the worth of
this increment. It is continually being bombarded with requests for
factual data from public and private administrative organizations.
The problem always confronting the Bureau is that of choosing best
courses of action in meeting the demands made upon it for information
from the thousands of individual administrative decision-making units.

From this point of view, the monograph makes good sense in de- 
scribing the procedure adopted by the Bureau in defining its new geo- 
graphic classification: 

In tabulating and publishing data, the Bureau of the Census has found an
increasing need for a set of areas intermediate in size between counties and
States. For many statistical purposes, State units are too large and
heterogeneous in their composition; whereas in many other instances county
units are too small and too numerous to be usable. In the 1950 Census of
Agriculture, for example, the use of State economic areas will provide
cross-tabulations in considerable detail of data that previously have been
available only at the State level. If made for individual counties, such
tabulations would be costly and more detailed than needed for most
purposes. . . . State economic areas are relatively homogeneous subdivisions
of States. They consist of single counties or groups of counties which have
similar economic and social characteristics. The boundaries of these areas
have been drawn in such a way that each State is subdivided into a few
parts, with each part having certain significant characteristics which
distinguish it from the other areas which it adjoins. . . . In general,
wherever it is not imperative that totals be reported for each county, the
State economic areas may be used to present a concise body of statistics for
the entire Nation, by States and their principal parts. Considerable savings
in publication space, tabulation costs, and clerical work can be made
through the use of these units instead of counties. Also, State economic
areas permit the tabulation of sample statistics in much more detail than is
possible for individual counties. It is for these reasons that State
economic areas have been established by the Bureau. (p. 1)


The rule selected by the Bureau for delimiting the areas was as
follows:

1. The delimitation was made on the basis of statistical and other
objective evidence; and homogeneity with respect to certain prescribed
economic and social indices was “the prime criterion in judging the
quality of the State economic areas delimitation.”

2. Each area was required to satisfy minimum size requirements.
Those areas intended for general tabulations were each required
to contain at least 100,000 inhabitants. Those intended for
agricultural tabulations were each required to contain at least 10,000
farms. Those areas designated as Metropolitan state economic
areas were each required to contain at least one central city of
not less than 50,000 inhabitants, and the entire Metropolitan
area, consisting of the county containing the central city and any
contiguous counties satisfying certain population and employment
requirements, was required to contain not less than 100,000
inhabitants.

3. The areas were required to follow county lines in all cases,
and all State boundaries were also State economic area boundaries.

4. The procedure consisted of the following steps: (a) A tentative
delimitation for each State was made, plotting on a county outline
map for this purpose regional delineations and type of farming
areas that were available from previous work of others; (b) these
tentative delimitations were then tested with statistical data in
accordance with the above requirements regarding size and homogeneity
with respect to specified indices on land use, level of
living, population characteristics, type of farm composition, et
cetera; (c) the tentative delimitations were thus revised and
submitted to other agencies for review: departments of agricultural
economics of each State agricultural college, each of the State
statisticians of the Crop Reporting Service, and persons or agencies
representing a non-agricultural interest.

5. The final review and determination of boundaries were made by a
working committee consisting of members from the Census
Bureau and from the Bureau of Agricultural Economics.
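
The minimum size requirements in item 2 lend themselves to a mechanical check. The sketch below is only an illustration of those thresholds as stated above; the class, the function, the field names, and the two example groupings are hypothetical and do not come from the monograph.

```python
from dataclasses import dataclass


@dataclass
class CandidateArea:
    """A tentative county grouping; all field names are illustrative."""
    name: str
    inhabitants: int
    farms: int = 0
    central_city_inhabitants: int = 0


def meets_size_requirement(area: CandidateArea, kind: str) -> bool:
    """Apply the minimum size thresholds quoted in item 2."""
    if kind == "general":
        return area.inhabitants >= 100_000
    if kind == "agricultural":
        return area.farms >= 10_000
    if kind == "metropolitan":
        return (area.central_city_inhabitants >= 50_000
                and area.inhabitants >= 100_000)
    raise ValueError(f"unknown kind of area: {kind}")


# Hypothetical groupings, not figures from the monograph.
upland = CandidateArea("Upland counties", inhabitants=82_000, farms=11_500)
river_city = CandidateArea("River City area", inhabitants=140_000,
                           central_city_inhabitants=61_000)
```

Here the hypothetical upland grouping would qualify for agricultural tabulations but not for general tabulations, which shows how the size rule constrains a delimitation independently of the homogeneity criterion.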


I have stated what seems to be the primary problem confronting
the Bureau: economically meeting demands for enumerated information.
Considering this as the main objective, I can readily conceive of
the Bureau’s staff convincing me that the above rule for delimiting
the Areas possesses something that could be regarded as optimal
properties. That is to say, the Bureau’s technical staff could
demonstrate to my satisfaction that the set of areas generated by the
above procedure is a “better” set than that obtained, for example, by
the different procedure used by Rand-McNally in its partitioning of the
nation into a set of areas. “Better” refers to the Bureau’s particular
purposes or objectives, which in general would not be the same as the
objectives that Rand-McNally’s areas were designed to serve. There now
exists no systematic method or theoretical basis for analyzing the
performance properties of a procedure for obtaining a set of areas
comparable, for example, to the theoretical basis that is available for
analyzing a sample survey design for obtaining a set of observations
upon a population. But the problem is in essence the same. Although
there is no known way of applying a formal test of the efficiency with
which a given procedure generates a set of areas satisfying
predetermined requirements, the Bureau nevertheless would seek a way of
determining a set of areas that is optimum in this sense of being
economical. The point of view of the Bureau, one may suppose, would be
reflected in what Morris H. Hansen (who, incidentally, is chairman of
the Bureau of the Census Committee on Statistical Areas) had to say on
another occasion and in connection with a different problem. “ . . . The
over-all test that we [the Bureau of the Census] apply to a sample
design is that it shall yield the desired information with the
reliability required at minimum cost; or conversely, that at a given
cost it shall yield the estimates desired with the maximum
reliability.”¹ The objectives of the Bureau in the case of these area
designations involve presentation as well as estimation; and it is only
in such terms of efficient statistical estimation and presentation that
I am able to understand the size requirements and the requirement of
homogeneity with respect to the prescribed indices as having
operational meaning.
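
The two forms of the over-all test quoted from Hansen can be written as a pair of selection rules. The sketch below is a loose illustration only: the candidate designs, their costs, and their standard errors are invented for the example, and the standard error is used here as a stand-in for "reliability", which is an assumption of the sketch rather than something stated in the quotation.

```python
def cheapest_design_meeting_reliability(designs, max_std_error):
    """Minimum-cost design among those meeting the required reliability."""
    feasible = [d for d in designs if d["std_error"] <= max_std_error]
    return min(feasible, key=lambda d: d["cost"]) if feasible else None


def most_reliable_design_within_cost(designs, budget):
    """Most reliable (smallest standard error) design at a given cost."""
    affordable = [d for d in designs if d["cost"] <= budget]
    return min(affordable, key=lambda d: d["std_error"]) if affordable else None


# Hypothetical candidate designs (cost in dollars, standard error in per cent).
designs = [
    {"name": "A", "cost": 40_000, "std_error": 3.0},
    {"name": "B", "cost": 55_000, "std_error": 2.0},
    {"name": "C", "cost": 90_000, "std_error": 1.5},
]

by_reliability = cheapest_design_meeting_reliability(designs, max_std_error=2.0)
by_cost = most_reliable_design_within_cost(designs, budget=60_000)
```

With these invented figures both rules select design B. No comparable formal test is available for a set of areas, as the text observes, but the same logic of cost against usefulness is what would give the size and homogeneity requirements their meaning.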


II


Up to now I have interpreted the Bureau's efforts as having been
directed to the problem of determining an optimal set of geographic
areas for the particular purposes which the Bureau serves. This is an
easily understood conception; but the monograph, in my judgment,
confuses the discussion by intermixing, along with such paragraphs as
have been quoted above, statements that imply an entirely different
idea. At least I get the impression that within the minds of responsible
members of the staff advising the Bureau there was an idea
corresponding to some kind of “natural area unit” containing something
that could be called a “distinct economy” that can be discerned or
perceived as an operating “unit.” Each State was to have been “divided
into its principal units” such that “within each unit a distinctive
economy prevails.” This notion would seem to imply that “distinctive
economy” is a term referring to a natural entity that any competent
person can recognize when it is held up before him. In accordance with
this idea


1 Hansen, Hurwitz, and Gurney, “Problems and Methods of the Sample Survey of Business,”
Journal of the American Statistical Association, 41 (1946), p. 180.
of “natural area unit,” there would be, not an optimum set of areas
for a specified purpose among an indefinitely large number of possible
sets of areas, but some certain and existent number of distinct
quasi-organic entities; and the procedure that was sought by the Bureau,
one may be led to suppose, was a procedure so designed as best to
facilitate the discerning and setting apart of these entities. The 501
that were set apart would, correspondingly, not be an estimate of the
“best” number but rather an estimate of the number. “Because Census data
are the sole or principal source of information about many aspects of
social and economic life,” the monograph states, “there has been an
increased demand that the Bureau of the Census undertake to provide
more detailed information about the major socio-economic areas in
each of the States. The State economic areas have been established in
an effort better to serve that growing body of analysts . . .” who are
conducting their work in terms of “functionally defined units of area.”

The term “functional area” or “functional unit” is used at a number 
of places to describe the areas, but there is no explanation of what 
meaning the word “functional” is supposed to convey. When it is stated 
in the monograph that within each of the area units being sought 
there prevails a “distinctive economy,” an attempt is made to define 
this latter term. “The term ‘economy’ is used here in its broadest 
sense; it refers to the total adjustment which the population of an 
area has made to a particular combination of natural resources and 
other environmental factors.” 

This statement, however, evidently has no operational content. In
an empirical field, a definition is supposed to serve the purpose of
setting apart those attributes by which that being defined may be
recognized and classified under the name assigned to it. The quoted
statement offers the following instructions: When confronted with a
geographic area containing (1) a population of human beings and (2) a
combination of natural resources and other environmental factors, the
research worker is to inquire into whether or not (1) has adjusted itself
to (2); if adjustment is observed to have taken place, then the
geographic area in question is to be classified as containing a
“distinct economy.” But in accordance with these requirements any area
whatsoever inhabited by human beings engaged in exploiting “natural
resources and other environmental factors” would qualify as an “area
unit containing a distinctive economy,” for ordinarily the expression
“engaged in exploiting” will do well enough as an equivalent for
“adjustment” in this context.

No special meaning is explicitly assigned to “total adjustment,” but
one gets the impression that in some way the idea of “homogeneity”
is associated with the ideas of “total adjustment” and “functional
grouping.” The test for “homogeneity” that was improvised by the
Bureau involved the following steps:


The indices [intended as measures of “land use”, of the allocation of employ-
ment among broad classes of industry, of the “level of living”, of population
characteristics, of “type of farm composition”, of “value of farm products”
by type of product, of “crops and yields”, of “livestock production”, etc.] for
all counties of a State were listed on worksheets, grouped according to the
tentative delimitation. Sums of indices for each grouping were taken, and
the mean value of each index for each group was computed. The index
values for each county were compared with the mean values for the two or
more groups in which the county could be placed. Wherever there was indi-
cation that a county would deviate less from its group mean for any index
if it were moved to an adjoining group, the fact was noted. . . . When all
indices for a county had been compared . . . the notations of deviation were
studied to determine whether or not jointly they indicated that the county
should be reclassified. . . . Because several index values were being em-
ployed to determine homogeneity, the indices were seldom unanimous in
placing a county either in one grouping or in another. The indices indicating
change and those indicating no change were considered for their relative
importance in characterizing the two areas being separated.
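The comparison described in the quoted passage can be sketched in a few lines of Python. This is only one illustrative reading of the quoted steps, not the Bureau's worksheet procedure. The county names, group labels, index values, and the function names `group_means` and `reassignment_votes` are hypothetical, and absolute deviation is assumed as the measure of how far a county lies from a group mean.

```python
from statistics import mean

def group_means(indices, assignment):
    """Mean value of each index within each tentative group of counties."""
    rows_by_group = {}
    for county, group in assignment.items():
        rows_by_group.setdefault(group, []).append(indices[county])
    return {g: [mean(col) for col in zip(*rows)] for g, rows in rows_by_group.items()}

def reassignment_votes(indices, assignment, adjoining):
    """For each county and adjoining group, count the indices that lie
    closer to the adjoining group's mean than to the county's own group mean."""
    means = group_means(indices, assignment)
    votes = {}
    for county, own in assignment.items():
        for other in adjoining.get(county, []):
            closer = sum(
                abs(x - m_other) < abs(x - m_own)
                for x, m_own, m_other in zip(indices[county], means[own], means[other])
            )
            votes[(county, other)] = (closer, len(indices[county]))
    return votes

# Hypothetical data: two index values per county, two tentative groups.
indices = {
    "A": [10.0, 1.0], "B": [12.0, 2.0], "C": [28.0, 2.5],
    "D": [30.0, 9.0], "E": [32.0, 10.0],
}
assignment = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2}
adjoining = {"C": [2]}  # county C borders group 2

votes = reassignment_votes(indices, assignment, adjoining)
print(votes)  # one of C's two indices favors a move to group 2
```

In this toy case the two indices disagree about county C, which is the situation the passage describes as typical. The final weighing of the indices "for their relative importance" is left, as in the quoted procedure, to judgment.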


This is a plausible procedure for stratifying a population; and it is
easily understood how the strata obtained may be found to have useful
properties, for example as indicated above, in the improvement of the
efficiency with which certain population parameters are estimated or
of the effectiveness with which information is classified and presented.
But I see no grounds for calling them by names implying more than the
term “stratum” implies. I can see no more grounds for calling the strata
obtained by this procedure “functional groupings” or “distinctive econ-
omies” than if the stratification had been made in accordance with
some kind of criterion of “heterogeneity.” No explanation is given
which would convey an understanding of how the “adjustments” 
made by the population within any one of these strata may be regarded 
as different in essence from the “adjustments” made by the population 
within any other designated area. 

This seeking to meet the “needs” of political economists by provid-
ing them with “functionally defined units of area” is based upon what
I regard as a misunderstanding of the needs of political economists.
The need for these area units was described in the monograph:


Numerous problems of maladjustment to the environment, of improper use of
resources, of recurring or chronic conditions of unemployment or under-
employment in certain of these areas [that have emerged as a result of the
adjustment of the population to the physical environment] tend now to be
regarded as problems of the Nation, and are studied by a variety of special-
ists. Increased awareness of these conditions has accentuated the need for
more knowledge about local areas. The growing concern about problems of
these types has fostered a general tendency for analysts to descend below the
level of State statistics, and to conduct much of their work in terms of
smaller, more functionally defined units of area.


In my opinion, the important needs of the political economist studying 
the structure and functioning of an economic system are illustrated by 
the questions that this statement begs. “Areas” do not “emerge” with 
this “adjustment” process. The areas have been where they are all 
along. Something else emerges, and while it is spoken of as an economy
or an economic organization, research workers have yet to arrive at a 
generally satisfying mode of empirically observing and concretely de- 
scribing its structure and the process by which it is supposed to develop. 
By what criteria do observers “become aware” of the “improperness” 
or the “maladjustment” characterizing the performance of an economic 
system? Certainly not by the mere enumeration of the contents of geo- 
graphic areas however defined, for these terms imply a comparison 
and express an evaluation. To perceive “maladjustment” or “improper-
ness” or “underness” or “overness” in an observed situation is to pos-
sess the capacity for perceiving and making known by well-defined
criteria the qualities characterizing states of “adjustment” and of 
“properness” in the performance of the organization of economic units. 

Work toward establishing this capacity can hardly be said to have 
begun. There are, to be sure, judgments that are daily made to the
effect that per capita incomes in certain areas are too low or that indus-
trial development of certain areas is lagging or that the structures of
the economies of certain areas are out of balance. But upon inquiry into
the bases of these judgments, one must agree, I believe, that they issue
from our passions and sympathies, and one will search in vain for 
analytical criteria that will stand critical investigation. There are 
analytical criteria in economic theory by which the consistency of an 
argument in a social discussion may be judged. But no positive theory 
now exists which will account for the structural and operating fea- 
tures and developmental processes of a human economy as it performs 
under acceptable conditions of freedom for individual units and which 
would assist participants in a social discussion in evaluating how well
or how poorly an existing economy is performing and in making explicit 
the meaning of some such notion as “significant divergence” from 
“proper performance.” The work at this stage is primarily experi-
mental in the efforts that are being made to discover useful ways of
describing the structural features and the operating and growth char-
acteristics of an economic system. The Bureau contributes to this
work by responding to requests for information. But it cannot inform
a research worker in regard to the set of areas he should obtain observa-
tions upon any more than it is in a position to specify for administrative
agencies what set of areas is “best” for their respective administrative
purposes. For a given research worker dealing with some given problem
of description or of hypothesis-testing there may exist an optimum set
of areas upon which observations are to be made for this piece of re-
search. There exists no unique or single set of “natural area units” as
distinct operating entities and component parts of a human economy.


III


The implication of the last statement is that the economy, in its 
spatial aspects of structure and functioning, is to be regarded as a con-
tinuum. It occupies area but it is not to be identified with the area
occupied, and it has an objective form of its own as pattern and struc- 
ture. The economy is an organization of economizing units. The be- 
havior to be studied is not that of the units but rather of the system 
of units. As a population system, the organization is characterized by
structural and operating features, and these pertain to the system per 
se and not to any particular set of mortal economizing units that may 
exist at any specified point in time. 

A statistical description of the form and pattern assumed by this 
system may be made, I believe, without reference to any particular 
set of geographic sub-areas. For this purpose I would be inclined to 
develop certain of the ideas of Walter Christaller. Christaller studied
what he looked upon as a system of central places.² The central places
are the cities within an economy, and he classified cities into discrete
types, each type having its characteristic function to perform within
the system. The pattern that he described was that formed by the
spatial orientation of these central places of various types. His principal
contribution, I think, lies not in his description of what he thought he
saw as a particular pattern but in the steps that he took towards a
specification of the attributes of spatial pattern in general. His attri-
butes of pattern include the following: operationally definable types or
classes of population clusterings or central places; a natural ordering or
gradation of these central places into a hierarchical system of centers;
numerical regularity within this system with respect to the frequency
of central places of a given type or size; a numerically specifiable spatial
orientation of the different types of central places.

2 Die Zentralen Orte in Süddeutschland, Jena, 1933.

More recently, Bogue, the author of the monograph reviewed above, 
studied what he regarded as the structure of metropolitan communi-
ties.³ He did not present data for individual metropolitan communities
but only average figures for two large classes of communities for each 
of several parts of the nation. As distance from center is increased, his 
data describe population density as sloping downward in approximate 
conformance with a definite rule. Whereas Christaller’s conception 
calls to mind a configuration of population density peaks, Bogue's
study suggests a filling in around each peak to form a density configura-
tion that may be conceptually thought of as specifiable in terms of a
density function. Thus, the statistical conception of a density function 
may be employed in this description, and Christaller’s notion of a 
system of central places may be modified into the idea of a spatial 
density configuration of economic units. This latter would be observed 
if the locations of the family and individual units at a point in time 
were to be plotted as coordinates in space, and Christaller’s central 
places would be represented by the points within the area at which 
localized peak densities are observed.
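As a rough sketch of the idea of a spatial density configuration, the locations of units can be counted by small square cells and the cells of locally highest count picked out as candidate central places. Everything here is an illustrative assumption rather than a method taken from Christaller or Bogue: the coordinates, the cell size, the rule that a peak must strictly exceed all eight adjacent cells, and the names `density_grid` and `local_peaks`.

```python
from collections import Counter

def density_grid(points, cell):
    """Count units falling in each square cell of side `cell`."""
    return Counter((int(x // cell), int(y // cell)) for x, y in points)

def local_peaks(grid):
    """Cells whose count strictly exceeds that of every adjacent cell,
    sorted from the highest peak downward."""
    peaks = []
    for (i, j), n in grid.items():
        neighbours = [
            grid.get((i + di, j + dj), 0)
            for di in (-1, 0, 1) for dj in (-1, 0, 1)
            if (di, dj) != (0, 0)
        ]
        if n > max(neighbours):
            peaks.append(((i, j), n))
    return sorted(peaks, key=lambda p: -p[1])

# Hypothetical unit locations: a larger cluster with a fringe, and a smaller one.
points = (
    [(0.5, 0.5)] * 5 + [(1.5, 0.5)] * 2 +
    [(5.5, 5.5)] * 3 + [(4.5, 5.5)]
)
grid = density_grid(points, cell=1.0)
peaks = local_peaks(grid)
print(peaks)
```

With finer cells and real location data the counts would approximate the density function discussed above, and the ranking of peaks by height would give a first, crude ordering of centers.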

From this view, the United States, or any other large populated 
area, appears as an interconnected system of central places. The 
density of the economic units is at a local peak within each of these 
central places, but it does not fall off abruptly at anything that can be 
called an edge or limit of these cities. It declines systematically with 
distance from the center, and for the major centers the decline con- 
tinues for a hundred or more miles out, until an area of dominance of 
an adjacent major center is reached; and then population density rises 
relatively smoothly to another peak. Passing along connecting links
between the major centers, one sees rises to and falls from smaller 
density peaks at sub-centers. Each major center appears as a hub 
with spokes, so to speak, extending to other centers. The sub-centers 
associated with a given major center may be seen as lower order hubs 
with extending spokes. That which appears as a principal central place 
from the point of view of a particular set of centers may appear as a
subsidiary central place from the point of view of a larger set of places;
and the hierarchy of centers scales up finally to a nucleus for the entire
system.


3 The Structure of the Metropolitan Community: A Study of Dominance and Subdominance, Univer-
sity of Michigan, 1949.


FIGURE I. Relief traffic map. ... The height of the traffic bands indicates approximately the average density of traffic to be expected at all points on the system.
Source: Interregional Highways, 78th Congress, 2nd Session, House Document No. 379, Government Printing Office, Washington, 1944, between pp. 40 and 41.
The connections that link the density peaks into a system consist of
the transport and communications networks, and the entire configura- 
tion may be visualized without the aid of any sub-area designations 
regarded as economic regions. Figure I affords a rough visual impression
of the configuration covering this nation's area. The densities shown 
are on a scale such that only the major centers are indicated and refer 
to highway traffic rather than to the economic units within the nation. 
The two densities would surely be correlated, however, and this figure 
illustrates what I have in mind by a spatial density configuration of 
economic units.

The spacing and clustering of economic units constitutes a spacing
and clustering of complementing economic activities. The density
peaks represent clusterings of certain types of economic activities, and
these central places are formed into systems and sub-systems of central
places. This systematic formation may be rationalized and described
in common sense terms as follows. Some of the economic units are
engaged in activities the locations of which are determined by the
locations of natural resources and thus are dispersed over space. The
expectation is that each area of substantial size will be endowed with
natural resources ranging from those of a type any member of which
is found in only a very few places to those of a type the members of
which are widely distributed among most areas. All areas contain
alike these latter resources, and each contains some one or more of the
rarely found resources. The dispersed units are supplied and serviced
from central places. In the smallest central places the servicing and
supplying operations involve activities the products of which have the
smallest market range. The type of value added in these centers is
such that there is a relatively large rate of consumption per consuming
unit and also such that the rate of production of a tolerably efficient
producing unit is relatively small. Among the units serviced and sup-
plied are those whose product specializations may consist of outputs
of that type of resources found only in a few places. The mean dis-
tance moved from point of origin to destination by the value-added
embodied in the products of these units is relatively great. From this
extreme, the mean distances covered by resources and products of
other types range down to nearly zero; and in these small centers the
proportion of units of value-added shipped that terminate at some
given distance declines sharply as distance is increased.

Several of these small central places are grouped into a low-order 
system of places centered upon a somewhat larger place in which are 
found, in addition to the smallest-range central goods and services, 
other central goods and services with a somewhat larger market range.
The rate of consumption of these latter central goods per consuming
unit will be smaller than that of the goods of the smaller range so that a
larger number of consuming units will be required to take a given rate
of output. Or the size of the efficient producing unit may be greater than
that of those typical of the smaller center so that more consuming units
may be required in order to keep a producing unit going than are
available within the range of the smallest central place. Also, in this
second sized central place the chance is better than for the smallest
place that there will be located there some form of processing requiring
an assembling of resources not all of which are found within the range
of the smallest order of place. Thus, in these second-order centers there
are, as in the small centers, economic units engaged in activities pro-
ducing goods and services with market ranges varying from the rela-
tively large to virtually zero, and for these centers also the proportion
of value-added shipped that terminates at some given distance de-
clines sharply as distance is increased. The two types of centers may
have similarly shaped distance distributions, and in the short-distance
range of the scale the types of products and services are the same in the
two distributions. In the longer distance range of the scale the second-
order centers will tend to show a somewhat higher degree of product
diversification.

Each of the second-order central places has its subsidiary elementary
centers about it, and this constitutes the lowest-order system of cen-
ters. Several of these are grouped about a third sized place to form a
second-order system of centers. This third-order central place will in-
clude, in addition to the activities of units providing all the types of
central goods and services found in the smaller central places spaced
around it, economic activities providing central goods and services
with ranges reaching out over the entire system of which it is a nucleus
or center. All that was said in comparing the respective distributions
of market ranges for the two lower-order types of central places may
now be said in regard to this third-order place as compared with the
next place below it in the ordering of types of places. A place of this
third order is more likely to contain processing activities involving an
assembling of resources not found in all areas, and the tendency will be
for this assembling to draw from a larger area than is the case for the
smaller centers. The products sent the longer distances from these
larger centers will typically show a greater diversification than those
sent from the smaller centers. Several systems of this higher order are
oriented with respect to a still larger central place, and so on; and a
comparison of the situation to be found in any given order of place
with that characterizing the next lower order of place would involve
the same kind of generalization. A central place of any given order is a
source of central goods and services of all the types available in any
of the lower-order central places within the system of which it is a
center. In addition it is a source of central goods the market ranges of
which cover these centers and areas included with this system. And
in addition to this it is an assembling and processing point for resources
and products peculiar to the territory covered by its system of centers
and destined to be shipped to the central places of other systems of
centers.
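The nesting just described, in which a place of a given order supplies every type of central good found in the lower orders and sits at the center of a system of smaller places, can be pictured as a simple tree. The sketch below is only a schematic aid. The class `CentralPlace`, the place names, and the goods listed for each order are all hypothetical labels chosen for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class CentralPlace:
    name: str
    order: int  # 1 = smallest type of central place
    subsidiaries: list = field(default_factory=list)

    def goods_offered(self, goods_by_order):
        """Central goods of this place's order and of every lower order."""
        return [g for k in range(1, self.order + 1) for g in goods_by_order[k]]

    def system(self):
        """This place together with every place in the system it centers."""
        places = [self]
        for sub in self.subsidiaries:
            places.extend(sub.system())
        return places

# Hypothetical goods, ordered by increasing market range.
goods_by_order = {1: ["groceries"], 2: ["hardware"], 3: ["wholesaling"]}

villages = [CentralPlace(f"village {i}", 1) for i in range(4)]
towns = [
    CentralPlace("town A", 2, villages[:2]),
    CentralPlace("town B", 2, villages[2:]),
]
city = CentralPlace("city", 3, towns)

print(city.goods_offered(goods_by_order))
print([p.name for p in city.system()])
```

The tree captures only the ordering of places and the cumulative character of their functions. It says nothing about spacing or market range, which the density description is meant to supply.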

In the above description, the central places are regarded as points
of dispersion and points of absorption of economic flows. From each of
these centers, value and physical products and services are continually
moving outward, counterbalancing the inward flows of value and
physical products. For a given center the destination points would form
a density configuration and likewise would the origination points
for the flows coming into the center. Thus, a second kind of density
configuration has been introduced in the description of the structure
of the system. The pattern formed by the spatial distribution of the
economic units is specified by the first kind. The second kind describes
the spatial dispersion of (a) the destinations of the economic flows
emanating from any given concentration of units and (b) the origina-
tions of the economic flows terminating within any given concentra-
tion of units. Suppose that for a given central place and time period
the destination were known for each product or value addition pro-
duced by the units served by the central place. Then, the distance from
the center to the destination for each unit of value-added would be a
variate for which a frequency-distribution could be formed. This dis-
tribution would describe the density configuration of the destinations
of the flows originating in the center; and in a similar way there could
be constructed a distribution describing the density configuration of
the originations of the flows terminating in the center.
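A minimal sketch of such a distance distribution follows, assuming that each shipment from a center is recorded as a pair of distance and value-added. The shipment figures, the distance cutoffs, and the function name `cumulative_share` are hypothetical.

```python
def cumulative_share(shipments, cutoffs):
    """Proportion of value-added terminating within each cutoff distance.

    `shipments` is a list of (distance, value_added) pairs for flows
    originating in one center.
    """
    total = sum(value for _, value in shipments)
    return [
        sum(value for dist, value in shipments if dist <= cutoff) / total
        for cutoff in cutoffs
    ]

# Hypothetical flows out of one center: (miles, units of value-added).
shipments = [(5, 40.0), (20, 30.0), (150, 20.0), (600, 10.0)]
cutoffs = [10, 100, 1000]

shares = cumulative_share(shipments, cutoffs)
print(shares)  # [0.4, 0.7, 1.0]
```

The same function applied to pairs of origin distance and value-added for flows terminating in the center would give the companion distribution of originations.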

As an illustration of what I have in mind as this second kind of den-
sity function, I am including here Charts⁴ 1-7. These empirical distribu-
tions conform only roughly to the idea. They refer to carload rail

4 These charts were prepared by Jere Clark with whom I am working in the study of these data
and who is preparing a detailed study of economic flow interconnections as these are indicated by the
I.C.C. Waybill data. This work is being done under the auspices of the Bureau of Population and Eco-
nomic Research of the University of Virginia. This Bureau is also making possible a study by Carl
Madden pertaining to the development over time of the system of central places represented by the
cities of this nation and a study by Carlyle Baskin of the works of Walter Christaller as an illustration
of empirical work dealing with the spatial structure of a population system.
CHART 1a. Relative Frequency Polygon Showing the 252 Commodity
Groups, by Average Distance Hauled by Rail, United States, First Quarter,
1947, 1948, 1949, 1950.
Source: Data derived from I.C.C. Carload Waybill Analyses, Statements No. 517 (1949) and
No. 5180 (1950).

CHART 1b. Cumulative Relative Frequency Distribution of the 252 I.C.C.
Commodity Groups by Distance Hauled by Rail, United States, First Quarter,
1947, 1948, 1949, 1950.
tonnage and not to value-added in all products and services, and the
originations and terminations are classified by State and not by central
place. But they are illustrative nevertheless. Charts 1a and 1b show
the distributions for a succession of years of the average distances
hauled for some 252 commodity groups. The distributions approx-
imately conform to the same logarithmic normal distribution. It can
be shown that from year to year a commodity group will be found
within the same general neighborhood of the distance scale, although
over longer periods there may be expected systematic shifts of com-
modity groups along the scale that do not necessarily alter the shape
of the entire distribution. This chart represents a classification of com-
modity groups and indicates a degree of stability in the typing of
commodity groups in accordance with average distance hauled; but


CHART 2. Cumulative Relative Frequency Distribution of Cars of Rail
Freight Classified by Distance Hauled, All Commodities, United States, 1949, 
and 1st Quarter of 1947. 


(Plotted on Logarithmic Probability Paper) 
it does not represent the idea of the distance density function referred
to in the previous paragraph. Chart 2 presents a different arrangement
of the data, and the distributions shown are in rough conformance
with this idea of distance density function. It is for the United States
as a whole and shows the proportion of total tonnage terminating 
within any specified distance of the point of origin. The plotting is 
done in such a way that a straight line indicates a logarithmic normal 

CHART 3. Cumulative Relative Frequency Distribution of Carloads of Rail
Freight Originated in Alabama and Virginia, by Distance Hauled, Manufac-
tures and Miscellaneous, 1949.
Source: I.C.C. Carload Waybill Analyses, Statement No. 5110 (1949).

distribution and it may be seen that the distribution approximately
conforms to the same straight line for the periods shown. Two parame-
ters being sufficient for specifying a straight line, this would mean that
a knowledge of two parameters is sufficient for the description of the
distance distribution of tonnage hauled by rail; and these parameters
were stable during the period shown.
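The sense in which a straight line on logarithmic probability paper yields two parameters can be made concrete. If the cumulative proportion within distance d follows a logarithmic normal law, then the normal deviate z corresponding to that proportion is linear in log d, with z = (log d - mu) / sigma. The sketch below fits that line by ordinary least squares and reads off mu and sigma. It is an illustration under stated assumptions, not the procedure used for the charts: natural logarithms, an unweighted fit, the function name `fit_lognormal_line`, and the synthetic cumulative proportions are all choices made here.

```python
from math import log
from statistics import NormalDist, mean

def fit_lognormal_line(distances, cum_props):
    """Fit z = a + b*log(d) to cumulative proportions and return (mu, sigma).

    Each proportion must lie strictly between 0 and 1.
    """
    xs = [log(d) for d in distances]
    zs = [NormalDist().inv_cdf(p) for p in cum_props]  # normal deviates
    x_bar, z_bar = mean(xs), mean(zs)
    slope = (
        sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = z_bar - slope * x_bar
    sigma = 1.0 / slope
    mu = -intercept / slope
    return mu, sigma

# Synthetic check: proportions generated from a known logarithmic normal law.
true_mu, true_sigma = 5.0, 1.2
distances = [50, 100, 200, 400, 800]
cum_props = [NormalDist(true_mu, true_sigma).cdf(log(d)) for d in distances]

mu, sigma = fit_lognormal_line(distances, cum_props)
print(round(mu, 6), round(sigma, 6))
```

Stability of the distribution from period to period, in the sense of the text, would show up as near-equality of the fitted pair (mu, sigma) across periods.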


When the national distribution is broken into the component State
distributions, these State distributions are found to be characterized
by essentially the same commodity groups within one end of the scale
but by wide divergency in the other end, each State typically having
a commodity specialization in the long distance end. Distributions for
Virginia and Alabama are shown in Charts 3, 4, 5, 6, and 7. These
charts indicate for given origination points the proportion of traffic
terminating within any given distance of those points, the origination
points being grouped by States. Various ways of plotting have been
employed. Charts 3 and 4 illustrate the similarity between the dis-
tributions of the two States. Chart 5, plotted on logarithmic probability
paper, shows the similarity between the distribution for a State and


CHART 4. Cumulative Relative Frequency Distribution of Carloads of Rail
Freight Originated in Alabama and Virginia, by Distance Hauled, Manufactures
and Miscellaneous, 1948.
Source: I.C.C. Carload Waybill Analyses, Statement No. 5038 (1948).

that for the Nation. Charts 6 and 7 show logarithmic normal fits to the
distributions of a State, one distribution for the terminations of traffic
originating in Virginia and one for the originations of traffic terminat-
ing in Virginia. Six hundred miles is apparently far enough to include
the bulk of the termination points for rail traffic originating at points
within a given State and of the origination points for rail traffic ter-
minating at points within a given State. But these distributions refer
only to long distance traffic, and a very much shorter distance would be
sufficient for including the same proportion of all the value-added,
originating or terminating. The relatively stable flow of traffic that
can be observed over long distances is stable by virtue of a continual 
CHART 5. Cumulative Relative Frequency Distribution of Carloads of Rail
Freight Originated in 1948 and in the United States in 3rd Quarter, 1947, by
Distance Hauled, All Commodities.
Source: I.C.C. Carload Waybill Analyses, Statements No. 4822 (1947) and No. 5038 (1948).

adding-to and dropping-out process, the additions on the average going
relatively short distances before dropping out.

This mode of describing the spatial structure of an economic system
does not make essential use of sub-area designations as analytical
concepts. When a particular place is under consideration, the essential
matter is its orientation, type, and role within the density configuration
or system of which it is a part. Its flow connections are shown by its
distance distributions: the terminations of its physical and financial
flow originations and the originations of its physical and financial
CHART 6. Percentage Frequency Distribution of Observed Number of Cars
of Freight Originating in Virginia Shipped Given Distances, Compared with the
Number Indicated by the Logarithmic Normal Distribution, by Distance Inter-
vals, Manufactures and Miscellaneous, 1948.

CHART 7. Percentage Frequency Distribution of Observed Number of Cars
of Freight Terminating in Virginia Shipped Given Distances, Compared with the
Number Indicated by the Logarithmic Normal Distribution, by Distance Inter-
vals, Manufactures and Miscellaneous, 1948.

flow terminations. The basic idea in this description is that of a density
function; and that part of theoretical statistics which analyzes the
conditions that generate distributions having specified forms is rele-
vant in considerations directed to explaining the existence of the par-
ticular forms of the distributions observed and the processes by which
these forms have developed.

There would seem to me to be little basis for doubt regarding the pos-
sibility of constructing a mechanical model that would distribute and
from period to period redistribute individual entities among the com-
ponent small squares of a large partitioned area in such a way as to
illustrate certain aspects of the time process by which economic units
spread themselves in the course of many decades over an area such as
this Nation's. In my opinion the model in conception could be made
more realistic than the mechanical models that have been constructed
in the study of economic time series phenomena. There are specialists,
of course, who can analyze a situation of this sort to the extent that
the problem is adequately formulated. The end result would give the
expected numbers of elementary areas containing 0, 1, 2, 3, ..., n
individuals after the passage of a time interval of some given length,
and the expected distances between density peaks of specified sizes
would be indicated. With the continuing passage of time, the system
would develop at varying rates in its different parts. The analysis of
the process would throw light upon the expected quantitative charac-
ter of these divergencies in growth rates. The contribution of this
would seem to be obvious; for although much of social discussion deals
with comparisons and evaluations of regional (or national) growth
rates, there is at present an absolute lack of any analytical standard
by which to judge critically the observed divergencies in growth rates.
“Normal” divergency in growth rates and “normal” concentration of
population (concepts implicit in the comparisons made in many in-
stances of social discussion) refer to the “expected” divergency and the
“expected” concentration of individuals in a system that is functioning
“normally.” Analytical knowledge of these latter has yet to be pro-
vided, and evaluative comparisons that implicitly assume that in a
“normally” working system the different geographic parts of an eco-
nomic system should be developing at approximately the same rates
or that there is something “unhealthy” about the functioning of a
system in which some parts are showing negative growth rates ob-
viously beg the question of the attributes of a “normally” working
system.
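A toy version of the mechanical model contemplated above can be written down to show the kind of output it would give. The rule of allocation is entirely an assumption made here for illustration: in each period a fixed number of new individuals arrive, and each chooses a square with probability proportional to one plus the number of individuals already in it. The text specifies no such rule, and the names `simulate` and `occupancy_table` and all parameter values are hypothetical.

```python
import random
from collections import Counter

def simulate(n_cells, n_periods, arrivals_per_period, seed=0):
    """Distribute arrivals among cells, period by period.

    Assumed rule: within a period, each arrival picks a cell with
    probability proportional to (1 + occupants at the start of the period).
    """
    rng = random.Random(seed)
    occupants = [0] * n_cells
    for _ in range(n_periods):
        weights = [1 + k for k in occupants]
        chosen = rng.choices(range(n_cells), weights=weights, k=arrivals_per_period)
        for cell in chosen:
            occupants[cell] += 1
    return occupants

def occupancy_table(occupants):
    """Number of cells containing 0, 1, 2, ... individuals."""
    return dict(sorted(Counter(occupants).items()))

occupants = simulate(n_cells=100, n_periods=20, arrivals_per_period=10)
table = occupancy_table(occupants)
print(table)
```

Repeating the run over many seeds and averaging the tables would approximate the expected numbers of elementary areas containing 0, 1, 2, 3, and so on, individuals, and the spread of cell growth between periods would indicate how much divergency in growth rates such a rule produces by chance alone.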
LABOR PRODUCTIVITY IN THE SOVIET UNION*


IRVING H. SIEGEL
Twentieth Century Fund, Washington

I

THE productivity of labor, especially in “industry,”¹ is a subject of
intense interest in USSR. Its measure is described in an official
handbook as the “most important national economic index,”² and is
projected with production goals in the directives setting forth the
five-year plans. The term is probably mentioned more often in Soviet
popular and technical literature than in our own.³ The concept plays
a vital role in the “scientific” socialist theory of economic evolution
and in the advertised Soviet strategy for achieving the twin objectives
of full communism and the end of “capitalist encirclement.”⁴ Lenin’s
statement of this role, repeated by Stalin in 1929, has since become a
Soviet commonplace:

In the last analysis, productivity of labor is the most important, the princi-
pal thing, for the victory of the new social system. Capitalism created a pro-
ductivity unknown under serfdom. Capitalism can be utterly vanquished,
and will be utterly vanquished, by the fact that socialism creates a new and
much higher productivity of labor.⁵


With the adoption of comprehensive “planning” in 1928, a systematic
program was inaugurated to supply the material basis for achievement
of the long-range Soviet objectives. Agriculture has been collectivized, 
and many of its operations mechanized. Industry has been rebuilt, 
expanded, electrified, and otherwise modernized. The urban labor force 
has been enlarged by streams of “surplus” rural population, and its 
skill has been progressively elevated by means of on-the-job and voca- 


,.. * Revision of papers presented at fhe December 1951 meeting of the American Statistical Associa- 
tion and at the April 1952 meeting of the Nef York Chapter. The author is grateful to P. R. Lever, who 
Provided translations of Russign materials іп the course of a study conducted in 1949-51 for the Johns 
Hopkins University Operations Research Office; and to S. Fabricant, W. Galeyson, and А. Gerschenkron 
for their critical comments on the original paper. 

? Includes principally manufacturing, mining, and electric power supply- 

*Slovar Spravochnik po Sotsialno-Ekonomicheskoi Statistike (Dictionary-Bandbook for Socio- 
economic Statistics), 1948, p, 397. 

* There is по justification for concluding, however, that productivity is relatively “underemphasized” 
in U. В. (See S. E. Harris, Economic Planning, New York, 1949, pp. 49-54.) The notion of productivity 
is fundamental to competitive enterprise, but businessmen's decisions concerning it are usually couched 
in the language of accounting—in terms like “unit labor cost,” “unit cost,” and “profit,” Furthermore, 
{he Soviet press, aptly described by Professor Inkeles of Harvard as a “mass trade journal,” reflects by 

spi the state's preoccupation with problems of production and productivity. 

1 On the place of Productivity in Soviet theory, see J. Н. Towster, Political Power in the USSR: 

LI 1947, New York, 1948, pp. 410-11, or А. Vyshinsky, The Law of the Soviet State, New York, 1948, 
2-60. The term “communism” has, since Lenin, been used to describe a “higher” form of “socialism” 
characterised by distribution of output according to need and the “withering away” of the coercive 


* Quoted by J. Stalin, Sdlected Writings, New York, 1942, p. 135. 
65 


66 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1953 


“` tional training. Application of the principle of the “universal obligation 
to work" has meant the employment of many types commonly found 
outside the labor force in our own country. 
Coercive measures, institutional arrangements, and propaganda
direct the energies of the “workers and peasants”—the new “collective
masters” of society—toward the prime task of plan fulfillment. The
captive trade unions encourage their members to improve skills and
increase productivity. Western-style unemployment was “abolished”
in 1930. Absenteeism, turnover, and labor mobility are restricted by
stern penalties. The regular hours content of the work year has been
steadily increased since the early 1930's. The so-called “shortest work
day in the world” is officially 8 hours long, and overtime is obligatory.
The 6-day work week became standard on the eve of World War II.
The relatively small annual output of consumer goods is distributed in-
sofar as practicable on an incentive-pay basis. Honors, prizes, and
privileges are bestowed upon exemplary workers. Even before the
advent of Stakhanovism in 1935, the techniques of “socialist competi-
tion” were stressed for advancing backward workers, for generating
“labor enthusiasm,” and for hastening maturation of the socially
oriented “new Soviet man.”

What has been the impact on labor productivity of these and other
features of life under Soviet planning—on the course of productivity
and the standing of USSR in comparison with advanced “capitalist”
countries? To questions such as these the rest of this paper is devoted.


II


It must be recognized at the outset that certain practical and theo-
retical difficulties preclude the derivation of altogether satisfactory
quantitative answers to these questions. Among the practical obstacles
are the limited quantity, uneven quality, and inadequate documenta-
tion of the Soviet data and indexes relating to production, labor in-
put, and productivity. Virtually full-coverage statistics are preferred,
but these are compiled only as an incident to other operations. The
personnel involved are generally overburdened, insufficiently trained,


6 ... about the peculiarities of Soviet statistics. See, for example,
articles by A. Bergson, C. Clark, M. ... New York, 1951, pp. 45, 120-21;
... Observations, Princeton, 1950, p. 13, refers to a “report”—most
probably apocryphal—that “lie coefficients” were applied by the central
statistical authority to data submitted ... during the early ...
The pressure to fulfill plan commitments, moreover, encourages es-
tablishments to distort and exaggerate their accomplishments. Au-
thorities decry such abuses, but policing of the reporting system remains
lax. The finished statistics are used for administrative purposes and, 
after proper selection and landscaping, for domestic and foreign propa- 
ganda, too. 

Of particular interest are the limitations of productivity statistics 
for the important “industrial” sector and its components, which are 
called “branches.” The available figures, especially those for recent 
years, refer mainly to output per worker. Since variant, inconsistent, 
and ambiguous figures have been published for some years, the official 
industrial productivity series cannot be stated definitively. Since the 
suppression of detailed data on output, employment, and hours of
work began in the mid-1930's, foreign students must generally confine 
any independent reconstruction of the Soviet record to the pre-war 
years. Although Soviet analysts made international productivity com- 
parisons in the late 1930's, no similar estimates have been released 
for the postwar years. In fact, the prewar comparisons are still cited in 
Soviet literature, and 1929 is still erroneously mentioned as the Ameri- 
can peak year. 

The formula used for constructing the Soviet index of industrial
output per worker has most probably varied, yet the index is commonly
supposed to be a quotient of the composite “gross” production and
“wage-earner” employment measures.⁸ Test computations made by
this writer for 1928-35 suggest that some sort of weighted average of the
productivity relatives for the major branches was used for at least this
interval.⁹ A labor-weighted average of branch productivity indexes—
the kind of formula sponsored in our own country by technicians of the
WPA National Research Project and the U. S. Bureau of Labor
Statistics—was officially computed, together with the quotient of the
composite production and labor measures, for about five years fol-
lowing May 1943.¹⁰ It is not known which of the alternatives was pub-
lished for which years.
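
The difference between the two formulas named above can be made concrete with a small numerical sketch. The branch names and figures below are hypothetical, and the use of current-year employment as the weight in the labor-weighted average is an assumption of the sketch, since the weights actually used are not stated. With those weights the average equals output valued at base-year unit labor requirements divided by actual current employment.

```python
# Two ways of building an index of industrial output per worker.
# All figures are hypothetical and serve only to contrast the formulas.
from dataclasses import dataclass


@dataclass
class Branch:
    name: str
    output_base: float      # output in the base year, in "fixed" prices
    output_current: float   # output in the current year, same prices
    workers_base: float
    workers_current: float

    def productivity_relative(self) -> float:
        """Current output per worker as a ratio to base output per worker."""
        base = self.output_base / self.workers_base
        current = self.output_current / self.workers_current
        return current / base


def quotient_index(branches):
    """Composite output relative divided by composite employment relative."""
    output_rel = (sum(b.output_current for b in branches)
                  / sum(b.output_base for b in branches))
    workers_rel = (sum(b.workers_current for b in branches)
                   / sum(b.workers_base for b in branches))
    return output_rel / workers_rel


def labor_weighted_index(branches):
    """Mean of branch productivity relatives, weighted by current employment
    (the choice of current-year weights is an assumption of this sketch)."""
    total_workers = sum(b.workers_current for b in branches)
    return sum(b.workers_current * b.productivity_relative()
               for b in branches) / total_workers


# Labor shifts from a low-value branch to a high-value branch while output
# per worker inside each branch stays exactly the same.
branches = [
    Branch("machinery", 100.0, 200.0, 10.0, 20.0),  # 10 per worker in both years
    Branch("textiles", 100.0, 80.0, 50.0, 40.0),    # 2 per worker in both years
]

quotient = quotient_index(branches)
weighted = labor_weighted_index(branches)
print(f"quotient of composites: {quotient:.2f}")
print(f"labor-weighted average: {weighted:.2f}")
```

In this hypothetical case the quotient rises by 40 per cent solely because employment moved toward the branch with the higher ruble output per worker, while the labor-weighted average shows no change. The choice of formula can therefore matter a good deal for a published series.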

7 A position somewhat different from the one taken here is maintained by W. Galenson, “Russian
Labor Productivity Statistics,” Industrial and Labor Relations Review, July 1951, pp. 497-08, and
“Soviet Industrial Productivity” (Rand Corporation, P-276), March 6, 1952, pp. 1-4.

8 The “wage earner” figures refer to rabochiye. They may exclude apprentices (ucheniki), whose con-
tribution to output is, however, included in the productivity numerator.

9 See E. L. Granovskii and B. L. Markus, Ekonomika Sotsialisticheskoi Promyshlennosti
(Economics of Socialist Industry), 1940, pp. 475-79.

10 ... comments on the concurrent computation of the labor-weighted average in the 1944 edition
(pp. 218-19) but omits reference to it in the 1948 edition, p. 308. This formula was
preferred by leading Soviet students but apparently gave disappointing results in the reconversion
period. D. V. Savinskii, Kurs Promyshlennoi Statistiki (Course in Industrial Statistics), 1949, p. 208,
suddenly announces its loss of official status after several pages of favorable comment—as though the
information was received near press time.
The gross output (valovaya produktsiya) measures which underlie
the productivity indexes for industry and its components have been
roundly criticized inside and outside USSR.¹¹ Various features which
clearly lead to an unduly favorable picture of Soviet growth have been
well publicized—like the inclusion of defective and incomplete goods 
and the incorporation of new products and new models (especially 
before 1937) with weights reflecting the price inflation which occurred 
after 1926-27, the “fixed” base year. But other features which have 
received less attention also inflate the apparent volume of output and, 
at least for some years, also exaggerate the changes in production and 
productivity with respect to the base levels. For example, the output 
aggregates include the cost of materials and other elements of “gross” 
price. The extent of duplication depends, not only on the degree of 
integration and on the reporting practices of establishments, but also 
on the significance of (net) imports of raw and semi-fabricated materials 
consumed in industry. Even if the measure of output were neutral to 
changes in degree of integration, domestic output during the war was 
doubtless overstated (and the American contribution to it under- 
stated) by the inclusion of substantial Lend-Lease shipments of steel 
and other materials. Similarly, the gains recorded in output and pro- 
ductivity after the war reflected in part the inclusion of materials 
acquired through trade agreements and as booty, reparations, and 
shares in the production of joint extraterritorial corporations. 
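
The dependence of a gross aggregate on the degree of integration, and on imported materials, can be illustrated with a hypothetical three-stage chain (ore, steel, machines). The establishments and ruble values below are invented for illustration; only the contrast between the gross total and the value-added total is the point of the sketch.

```python
# Gross output counts materials again at every stage; value added does not.
# Establishments and ruble values are hypothetical.

def gross_output(establishments):
    """Sum of the value of each establishment's output."""
    return sum(e["output"] for e in establishments)


def value_added(establishments):
    """Output less purchased materials, summed over establishments."""
    return sum(e["output"] - e["materials"] for e in establishments)


# Three separate establishments, each buying the previous stage's product.
separate = [
    {"name": "ore mine", "output": 20, "materials": 0},
    {"name": "steel mill", "output": 50, "materials": 20},
    {"name": "machine plant", "output": 100, "materials": 50},
]

# The same activity carried out inside one integrated combine.
integrated = [{"name": "combine", "output": 100, "materials": 0}]

# The machine plant alone, working with imported (e.g. Lend-Lease) steel.
imported_steel = [{"name": "machine plant", "output": 100, "materials": 50}]

for label, case in [("separate", separate),
                    ("integrated", integrated),
                    ("imported steel", imported_steel)]:
    print(f"{label:15s} gross={gross_output(case):4d} "
          f"value added={value_added(case):4d}")
```

With the same final product, the gross total in this sketch ranges from 100 to 170 depending on how the establishments are organized. In the import case the gross figure credits domestic industry with 50 rubles of materials that it did not produce.
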
Another source of inflation of output and productivity is the very 
definition of gross output. The Soviet concept is generally discussed as
though it referred only to the “physical” volume of finished goods and
semi-manufactures, but it also includes an adjustment for inventory
changes and numerous expenditure items presumably reckoned in
“fixed” rubles. Among the latter are: communal and other incidental
services, internal plant additions, maintenance and repairs, product
development cost, contract work, and certain cancelled orders. The
opportunities for confusing input with output are obviously abundant
in a newly industrializing country which frowns upon markets and
establishes prices administratively. Soviet writers themselves have
often criticized tendencies to count all work done by “socialist” labor
as “productive”; to improve the apparent performance of an establish-
ment by padding the “other business” account; to record defective or
unwanted goods as output; and to include illegally in industrial pro-
duction any major construction projects executed with plant person-
nel.¹² Trotsky, who anticipated many of the criticisms made by Ameri-
can students of Soviet statistics, noted particularly the inflation of out-
put aggregates by the inclusion of repair, so that apparent output
increases as quality diminishes: “It is not always certain what hides
behind [the ruble of output]—the construction of a machine or its
premature breakdown.”¹³

In 1949, current weights were supposed to replace the largely fic-
titious 1926-27 prices;¹⁴ and other changes had been made earlier.
But the industrial output and productivity indexes remain subject to
idiosyncrasies of the gross product concept and will presumably re-
tain much of the buoyancy so desirable from the standpoint of propa-
ganda.

It would be a mistake, however, to suppose that Soviet planners and 
analysts have been misled. In 1936, industrial production statistics 
based on another concept (tovarnaya produktsiya) were introduced,
but not made available publicly. These figures have apparently been
used in planning since 1940. The concept is restricted to goods in
“finished” or “tradable” form,¹⁵ and the prices are current.

The Soviet gross output definition and measurement practices 
doubtless lead to industrial production and productivity indexes which 
rise more spectacularly than conceivable alternative indexes computed
by the most preferred Western methods. Conversely, the application
of the Soviet concept and methods to American data should lead to
indexes rising more steeply than the standard U.S. series. Western
practice would favor the “finished” product concept in lieu of “gross”;
a chain index for branch output (with or without adjustment of the
links for deficiencies of coverage) in lieu of a fixed-base index with
12 For recent evidence of dissatisfaction with accounting practices, see A. Arakelian, Industrial
Management in the USSR, Washington, 1950, especially p. 142; M. M..., “The Planning of ...,”
... 1952, pp. 78-94; and New York Times, August 4, 1952.

13 L. Trotsky, ...

14 I. A. Sholomovich, Analiz Khozyaistvennoi Deyatelnosti Promyshlennogo Predpriatiya (Analysis
of Economic Activity of the Industrial Enterprise), 1949, p. 29. According to a recent ...
prices, and new items will be introduced at comparable 1952 prices. ... with the 1949 change from
1926-27 weights, and hence does not appear to be a ...

15 See, for example, P. Kholodnyi, “Planirovaniye Tovarnoi Produktsii” (Planning of Goods Pro-
duction), Planovoye Khozyaistvo (Planned Economy), 1940, No. 4, pp. 48-52; Savinskii, op. cit., pp.
112-13; and Bukhgalterskii Uchet, pp. 308-12. Gerschenkron's assumption that the “gross” and “finished”
production series are close seems untenable in view of the abuses permitted by the former concept and
complaints of Soviet technicians leading to the introduction of the latter. See his A Dollar Index of
Soviet Machinery Output, 1927-28 to 1937 (Rand Corporation, R-197), April 6, 1951, pp. 6-7.
pseudo-price weights assigned to new products and models; and, in the
event a fixed base is desired, a recent weighting pattern in lieu of a very
early one. For industry as a whole, Western practice would favor the
combination of branch output indexes with net (i.e., value-added or
employment) weights, but this preference may not critically affect the
results. On the other hand, the characteristic Western treatment of
nonhomogeneous output like machinery—either exclusion from the
industry total or incorporation by means of a coverage adjustment—
could well lead to understatement of the true rates of Soviet industrial 
output and productivity growth. 
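
The contrast between a fixed-base index with pseudo-price weights and a chain index can be sketched with two hypothetical products. The quantities and prices are invented. The sketch assumes that the new product enters the fixed-base aggregate at its inflated introduction price, and the chain index here simply links matched products without any coverage adjustment.

```python
# Fixed-base index with a pseudo-price for a new product versus a chain
# index built from matched products only. All data are hypothetical.

def fixed_base_index(quantities, base_prices):
    """Value each year's quantities at one fixed set of 'base' prices."""
    values = [sum(base_prices[p] * q for p, q in year.items())
              for year in quantities]
    return [v / values[0] for v in values]


def chain_index(quantities, prices):
    """Link year-to-year quantity changes, each valued at the earlier
    year's prices and restricted to products present in both years."""
    index = [1.0]
    for t in range(1, len(quantities)):
        matched = quantities[t].keys() & quantities[t - 1].keys()
        num = sum(prices[t - 1][p] * quantities[t][p] for p in matched)
        den = sum(prices[t - 1][p] * quantities[t - 1][p] for p in matched)
        index.append(index[-1] * num / den)
    return index


# Year 0 is the base; the tractor first appears in year 1, after prices
# have risen, so it has no genuine base-year price.
quantities = [
    {"cloth": 100},
    {"cloth": 100, "tractor": 2},
    {"cloth": 110, "tractor": 4},
]
prices = [
    {"cloth": 1.0},
    {"cloth": 2.0, "tractor": 50.0},
    {"cloth": 2.0, "tractor": 50.0},
]
# Pseudo-price weights: cloth at its true base price, the tractor at its
# inflated introduction price (an assumption of this sketch).
base_prices = {"cloth": 1.0, "tractor": 50.0}

fixed = fixed_base_index(quantities, base_prices)
chained = chain_index(quantities, prices)
print("fixed-base:", [round(x, 2) for x in fixed])
print("chain:     ", [round(x, 2) for x in chained])
```

Under these invented figures the fixed-base series reaches 3.1 in the final year and the chain series 1.4. The chain result ignores the first appearance of the new product, which is the kind of coverage deficiency that could understate true growth.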

Finally, some remarks on the theoretical limitations of the index- 
number technique are in order. Even if complete, ideal data were 
available, it would still be impossible to construct unique or universally 
acceptable temporal indexes for USSR or any other industrializing 
country undergoing fundamental structural change within a brief 
period. Index-number statisticians in “mature” Western countries are
not so concerned as they ought to be over this important fact.

What applies to temporal indexes also applies to “spatial” ones— 
to international comparisons of multiproduct industry branches or of 
industry as a whole. Unique or universally acceptable measures cannot
be derived for countries which differ significantly in technology, tastes,
price patterns, and assortments and specifications of products. Diffi-
cult practical problems of valuation also arise when the countries do
not engage in substantial trade with each other or with a common
third country. In these circumstances, there is a temptation to over-
state the significance of comparisons restricted to industry branches
with pseudo-homogeneous products, like “steel” or “coal.” 


III


When the most favorable Soviet claims concerning industrial output
per worker are pieced together, a picture of remarkable growth emerges.¹⁶
The 1950 figure was supposed to be 37 per cent above the 1940 level 
and the 1951 figure a modest 10 per cent above that for 1950. The 1950 
figure was also supposed to be almost 5 times the 1928 index, almost 

16 The series discussed ... and for the relation of 1950 to ...
The figures for the earlier years refer to “large-scale” industry, to establishments meeting the mini-
mum employment or power criterion set up ... Soviet in-
dustry, the criterion has diminished in importance and ... appears finally to have been abandoned.
7 times that for 1913, and almost 10 times that for 1900. Setbacks are
implied for 1940-41 and for the early reconversion period; but none
is now shown for any year during the first plan period despite con-
temporary evidence. The series available for individual branches for
the years prior to World War II generally indicate above-average
gains for producers’ goods and below-average gains for consumers’ 
goods. 

Available statistics for output per man-hour show gains of a similar
order, except that the rise during the first plan period (1928-32) is
sharper and the rise in the third plan period (1938-40) not so steep.¹⁷
If man-hour productivity figures were also reported for the postwar 
period, they would doubtless show a smaller advance than 37 per 
cent beyond the 1940 level because the hours content of the work year 
increased.

Two variant estimates of output per worker derivable for 1950 are
somewhat smaller than the figure cited above but are, nevertheless, of
the same order. One, based on a lower 1940 level plus the 37 per cent
gain for 1940-50, is 4.5 times the 1928 index.¹⁸ The other, based on
reported gains of 41 per cent for the first plan period, 82 per cent for
the second, and 32 per cent for the third (plus the 37 per cent gain for
1940-50), is 4.6 times the 1928 index.¹⁹
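
The arithmetic behind these multiples is simple compounding, which a few lines make explicit. The percentage gains and 1940 levels are those quoted in the text and its notes; the function name is only illustrative.

```python
# Compound the claimed percentage gains into multiples of the 1928 index.
from math import prod


def compound(gains_pct, start=1.0):
    """Apply successive percentage gains to a starting index level."""
    return start * prod(1 + g / 100 for g in gains_pct)


# Higher 1940 level (3.6 times 1928) plus the 37 per cent gain for 1940-50.
headline = compound([37], start=3.6)
# Lower 1940 level (3.25 times 1928) plus the same 37 per cent gain.
variant_one = compound([37], start=3.25)
# Plan-by-plan gains of 41, 82, and 32 per cent, then 37 per cent for 1940-50.
variant_two = compound([41, 82, 32, 37])

print(f"headline:    {headline:.2f} times 1928")
print(f"variant one: {variant_one:.2f} times 1928")
print(f"variant two: {variant_two:.2f} times 1928")
```

Carried to two decimals the results are about 4.93, 4.45, and 4.64, which agree with the rounded figures of “almost 5,” 4.5, and 4.6.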

The average annual increases claimed for Soviet worker productivity 
are, of course, much greater than the 2 per cent long-term rate com-
monly cited for our own country. Furthermore, no decisive slackening
of the rate is evident through time. During the half century 1900-50,
the claimed average annual rate was 4.7 per cent. For the period since
the eve of World War I, 1913-50, the average rate was 5.2 per cent.
For the period since the inauguration of planning, 1928-50, the average
was 7.5 per cent. For the first three plans, 1928-40, the remarkable
average of more than 11 per cent is indicated. The rate for the fourth
plan, 1946-50, is even higher. Though reattainment of the 1940 level
was not claimed until 1948, an index 37 per cent above that level was
supposed to have been reached by 1950! The directive for the present


17 Based principally on SSSR i Kapitalisticheskiye Strany (USSR and Capitalist Countries), 1939,
p. 76, and B. L. Markus, “The Stakhanov Movement and the Increased Productivity of Labour in the
USSR,” International Labour Review, July 1936, p. 7.

18 The lower 1940 level (3.25 times the 1928 index) is established on the basis of Maslova's 1937
figure and the claim of a 32 per cent increase to 1940. Our higher figure is based on Turetskii's possibly
correct application of this percentage change to the 1938 figure and his (consistent) assertion that the
index was “more than 3.5 times” the 1928 base (we used 3.6).

19 These percentage gains for the first three plans, which vary somewhat from other published
figures, are still shown in the 1952 article cited in footnote 12.
five-year plan, 1951-55, sets a target of “approximately 50 per cent”
for the entire period,²⁰ or about 9 per cent per year.
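
The average annual rates quoted above follow from the cumulative multiples by taking the appropriate root. The sketch below uses the rounded multiples given in the text (“almost 10 times,” “almost 7 times,” and so on), so the computed rates run slightly above the quoted ones; the function name is illustrative.

```python
# Average annual growth rate implied by a cumulative multiple over n years.

def annual_rate(multiple, years):
    """Constant yearly rate that compounds to `multiple` after `years`."""
    return multiple ** (1 / years) - 1


# (period, rounded multiple from the text, number of years)
claims = [
    ("1900-50", 10.0, 50),   # "almost 10 times" the 1900 level
    ("1913-50", 7.0, 37),    # "almost 7 times" the 1913 level
    ("1928-50", 5.0, 22),    # "almost 5 times" the 1928 index
    ("1928-40", 3.6, 12),    # 1940 level used in the text's higher estimate
    ("1951-55", 1.5, 5),     # plan target of "approximately 50 per cent"
]

rates = {period: annual_rate(m, n) for period, m, n in claims}
for period, rate in rates.items():
    print(f"{period}: {100 * rate:.1f} per cent per year")
```

On this reckoning the 1951-55 target compounds to a little under 8.5 per cent a year, somewhat below the rounded figure of about 9 per cent given in the text.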

In view of the remarks made in the preceding section, these figures 
overstate the degree of Soviet progress. Starting from a more primitive 
productivity level, USSR should have exceeded the average American 
long-term growth rate—especially since the welfare implicit in our high 
levels of civil liberties, leisure, and consumer goods output does not 
enter into the measure of production. On the other hand, measure- 
ment according to best Western practices would probably have indi- 
cated no important gain in Soviet output per worker for the first plan 
period as a whole; a substantial gain during the second and third
periods, say to about 75 per cent above the 1928 level by 1940; and 
reattainment of the 1940 level by the end of 1950 rather than a signifi- 
cant rise beyond it. 

The claims of substantial growth in output per worker during the 
first and fourth plans—8 per cent per year and about 13, respectively 
—are unconvincing even though the productivity potential was cer- 
tainly raised during those years. The rise in the potential during the 
first plan was partly realized in the subsequent years, as the new and 
renovated plants were utilized more fully and as labor acquired the 
new “habit of work” and gained experience. In both periods, the influx 
of workers into the nonagricultural labor force exceeded the planned 
increase by millions. Such large additions of inexperienced workers
would seem incompatible with sharp advances in marginal productiv- 
ity if output were measured by Western methods. The first plan period, 
moreover, was very unsettled; it was marked by terror in the agricul- 
tural sector, elimination of unemployment compensation, evisceration of 
trade unions, inefficient management, widespread “wrecking,” and ac- 
knowledged deterioration of output quality. Output per worker should 
also have been unfavorably affected by the alleged reduction in official 
hours of work and in the number of work days per year. Obstacles to
true productivity growth in the fourth plan period included poor utili- 
zation of the swollen labor force, the reconversion decline in compul- 
sory overtime, and the demoralization incident to the continuation of 
austerity and the loss of wartime "savings" through currency reform. 
The official claim that 1950 production exceeded the 1940 performance 
by 78 per cent doubtless reflected the inclusion of inflationary items 
of the kind already mentioned—e.g., material imports, the rehabilita-
tion of war-ravaged plants and (illegally recorded) major construction
by industry personnel, and the payment of wages to workers retained 

20 New York Times, August 23, 1952. 
during reconversion for retraining. Measured productivity might also
have been favorably affected by asymmetrical treatment of the output
and the employment of war prisoners and penalized Soviet citizens en-
gaged in industry.

Among the productivity estimates devised by foreign students for 
Soviet industry, those of Colin Clark for 1913-36 are probably the best 
known.²¹ These series differ conceptually from the official productivity
index; they purport to show net output (in constant American dollars 
of 1925-34) per employed person (wage earners and others) and per 
corresponding man-hour. Between the terminal dates, output per em- 
ployed person supposedly advanced only one-third, while output per 
man-hour increased by three-fourths. Between 1928 and 1936, the rise 
in both measures was only about one-fifth. Clark takes no direct ac- 
count of the new products introduced after 1928; he relies on a small 
list of products which could reflect the change in the total only by co- 
incidence. Curiously, he estimates a gain of more than two-fifths in 
both series for 1935-36, the first year of Stakhanovism—a gain about 
twice that claimed by the official Soviet index of output per worker. 

Clark has also prepared the longest series available on Russian real 
national product per man-hour.²² This series refers to the Western con-
cept of national product rather than to the Soviet concept, which is
much less inclusive. It is expressed in American dollars of the period 
1925-34 and covers selected years in the interval 1900-47. The figures 
necessarily show little benefit from the shift of labor from agriculture 
to industry, since the denominator omits “disguised unemployment” 
and women in agriculture. Only in 1940 did man-hour productivity 
exceed the 1913 level—and then, barely. A net gain of only one-eighth 
is indicated for 1928-40. A serious decline is shown for the first plan 
period. The 1940 level had not been reattained by 1947. 
Jasny's figures for agriculture, which presumably do not exclude
“disguised unemployment,” suggest a modest rise of 28 per cent in
net output per farm worker between 1928 and 1940.²³ The rise, if any,
in net output per man-hour was smaller; the increase in the number of
work days was probably not offset by the reduction in hours per work


21 Review of Economic Progress, September 1949, p. 1, or Conditions of Economic Progress, London,
1951, p. 277. See latter source, pp. 185-87, 254-55, for details of the net product index, especially the
... was used in interpolation for early years and extrapolation for 1929-36.

22 Review of Economic Progress, April 1949, p. 2. The treatment of agricultural employment is dis-
cussed in Conditions of Economic Progress, p. 190; and the productivity series is presented up to 1940,
with various underlying series, on p. 191.

23 See N. Jasny, The Socialized Agriculture of the USSR: Plans and Performance, Stanford, 1949,
pp. 420, 676, 714; J. A. Kershaw’s review of this book, American Economic Review, March 1950, especi-
ally pp. 183-84; and Kershaw, “Soviet Agricultural Prospects” (Rand Corporation, P-278), March 7,
1952, pp. iii and ...
day. Since the 1932 level of farm productivity was doubtless much
lower than that for 1928, the advance beyond the later year was much 
more pronounced. 

Finally, some increase in worker productivity during the planning 
era is also indicated for rail transportation.²⁴ Freight and passenger
traffic per “direct” employee in 1932 was 87 per cent above the 1913 
average. The increase between 1928 and 1932 must have been of a simi- 
lar order. A further substantial gain during the second plan period 
brought the index to 2.7 times the 1913 figure by 1937. The sizable 
gains during these years reflect the increase in capacity utilization un- 
der the same kinds of pressure which elevated American railroad pro- 
ductivity at a still more impressive rate during 1939-43. Soviet rail 
productivity has been virtually static, however, since 1937, except for 
the wartime setback. The 1950 goal called for an increase of only 8 
per cent above the 1937 or 1940 level, but a rise of only 2 per cent was 
achieved. In view of the increase in the work week, the 1950 level of rail 
output per man-hour was doubtless below the 1940 rate. 


IV 


Despite the defects of the available statistics and the numerous pit- 
falls of international comparison, there can be no doubt that USSR 
lags far behind U. S. in productivity. In the final section, it will be sug- 
gested further that the gap cannot easily be closed even though the dif- 
ferential might be narrowed. 

Soviet researches conducted shortly before World War II concluded 
that industrial output per worker in 1937, calculated in 1926-27 rubles,
amounted to two-fifths the American figure. A slightly higher ratio, 44
per cent, was claimed for output per man-hour. The ratios for output
per worker and per man-hour were both supposed to be about one-
fourth for 1932; and about one-sixth and one-fifth, respectively, for
1928. USSR had supposedly caught up with Great Britain and Ger-
many in output per worker by 1937, although rates only half as large
were claimed for 1928.²⁵

These figures might be challenged on various grounds (e.g., the plaus-
ibility of the dollar-ruble conversion rates). But, even if they are con-
ceded, the gains they show are not remarkable. Comparisons made in
1926-27 rubles should show Soviet accomplishments in a more favora-
ble light than comparisons expressed in dollars or sterling.²⁶ Further-
more, the years 1932 and 1937 were depressed ones for U. S. and West-
ern Europe. Finally, a different perspective is obtained if a 1908 esti-
mate by Lenin, cited in other contexts by Soviet authors, is juxtaposed
to the figures for the planning era. According to Lenin, the productiv- 
ity of the Russian industrial worker was already 30 per cent of the 
American in 1908.²⁷ Since this percentage is higher than the 1928 and
1932 proportions, the alleged gain between 1928 and 1937 largely rep-
resents the recovery of lost ground. An advance from 30 per cent in
1908 to only 40 per cent in 1937, a span of 30 years, would seem like
slow progress indeed. 

Other computations made outside USSR also indicate a large Ameri- 
can-Soviet industrial productivity differential. According to Colin 
Clark, Soviet net industrial output per man-hour (all personnel) was 
less than one-fourth, and net output per employee was about one-fifth,
the corresponding American rates in 1936. These relationships, accord-
ing to Clark, were less favorable than in 1913. Another set of esti-
mates,²⁸ used in American government circles during the war, placed
Soviet industrial output per man-hour in 1935-38 at 36 per cent of the 
American rate—a relative performance comparable to the British, 
slightly inferior to the German, but superior to the Japanese. Accord- 
ing to the same source, Soviet productivity in munitions-making was 
39 per cent of the American rate in 1944—almost equal to the British 
percentage, below the German, but substantially above the Japanese.

There also are some postwar estimates, the foundations of which are 
even less firm. Included among these is a low figure of one-fourth or 
one-fifth the American industrial output per man-hour,²⁹ a median fig-
ure of 40 per cent the American industrial output per worker,³⁰ and a
broad range of one-fourth to three-fourths the American output per
nonagricultural worker.³¹

26 Computations of S. Yugenburg in Planovoye Khozyaistvo, 1937, No. 3, pp. 52-54.

27 ... B. L. Markus in a volume issued by the Soviet Academy of Sciences, Proizvo-
ditelnost Truda v Promyshlennosti SSSR (Labor Productivity in Soviet Industry), ...

28 ... Output in World War II,” Military Affairs, Spring 1946, p. 79.

29 D. B. Shimkin, “What Is Russia's Industrial Strength? II,” Automotive Industries, August 10,
1950, p. 35, and “... Industrial Expansion,” Fortune, May 1951, p. 107.

30 Galenson ... based on an examination of 1937-39 Soviet-American differentials in
individual industries having output changes measurable in “physical” units; and on probable
... Soviet and American productivity since then.

31 ... “The Economic War Potential of the USSR,” American Economic Review, May ...
In agriculture, too, U. S. enjoys a decisive productivity advantage.
Before the war, according to Jasny,³² over 4 times as many man-days
were required per acre of grain in USSR as in U. S., almost 4 times as
many per acre of cotton and potatoes, and about 6 times as many per
acre of sugar beets. Over 6 times as much labor was expended for the
same quantity of milk. Volin notes that the gamut of operations on
winter grain required 2.5-3.6 man-days per acre in 1937 in the highly
mechanized South Ukraine and North Caucasus, while the average
direct American labor requirement for wheat was one man-day or less.³³
This differential in favor of U. S. would be increased if management
and other overhead labor were included in the calculations. The per-
sistence of a sizable American advantage in the postwar period is indi-
cated by the sharp advance in American agricultural productivity since
1939 and by continuing Soviet complaints against such irregularities 
as the employment of superfluous labor and the maldistribution of per- 
sonnel on collective farms. 

When statistics for the whole of productive activity are examined,
USSR again appears to be hopelessly outclassed. Clark’s series for net
national product per man-hour, as we have already noted, omits
“disguised unemployment” in agriculture, so it tends to favor USSR.
Nevertheless, his figures indicate that Soviet productivity amounted
to one-third the American in 1900, one-fifth in 1928, less than one-fifth
in 1940, and less than one-eighth in 1947. Both before and after World
War II, according to Clark, British and French productivity also
greatly exceeded Russian levels. When United Nations figures for 1949
national income in American prices are related to estimates of labor
force, we find that Soviet income per labor force participant was only
one-sixth the American, about $600 compared to $3500. The Soviet
figure was only one-third the British and two-thirds the French.

It may be superfluous to add that Soviet per capita output and
consumption standards also lag far behind the American and, in many
instances, Western European. USSR clearly has a long way to go to
fulfill the “cardinal task,” as Stalin has designated it, of outstripping


3 Jasny, The Socialized Agriculture of the USSR. 
speaks of future development as a “gradual transition” from
“socialism” to “communism” requiring a vague number of five-year plans.


V

What about the future course of Soviet productivity and the future 
of the Soviet-American productivity differential? USSR may reasona- 
bly be expected to make gains in both respects in the years to come, 
but rapid progress seems unlikely. There are restraining factors within 
the very process of broad economic development, even (especially?) 
under “planning”; in the initial conditions of industrial backwardness 
and of surplus rural population; and in the incompatibilities of Soviet 
objectives, ideology, and institutions.

Even under ideal circumstances of abundant entrepreneurial abil- 
ity at all administrative levels, time would be required for the devel- 
opment of a high productivity potential and of efficient techniques for 
realization of this potential. Time is needed at best to expand a na- 
tion’s supply of materials, power, plant and equipment, and adequately 
skilled labor. It is also needed for the establishment and maintenance 
of a dynamic, more or less harmonious balance of these expanding fac- 
tors. In USSR, it will take a long while to achieve the timely replace- 
ment of worn-out machinery, the near-optimum utilization of equip- 
ment as it becomes available, the specialization of establishments in 
lieu of excessive integration, the smooth redistribution of industrial (or 
agricultural) personnel made redundant by mechanization and
automatization.

The way in which the surplus labor problem is handled will have an
important effect on the rate of Soviet productivity growth. The theory
of “universal obligation to work,” the fact of rapid shift of labor from
agriculture, the absence of unemployment in the market sense, the
virtual ban on discharge of personnel, the toleration of labor hoarding,
and the frozen status of workers contribute to the accumulation of
poorly used reserves in industry as mechanization proceeds. There is
no ministry responsible for combing out and redistributing such
reserves across ministerial or geographic boundaries, except, in a
sense, the police. The existence of these reserves discourages the
mechanization of auxiliary activities like loading, unloading, internal
transportation, and inspection; it also discourages the most efficient
use of mech-


[...] this “gradual transition” is listed among the duties of communists
in the new Soviet [...] (New York Times, August 21, 1952). In June 1950,
an important though inconclusive conference was held on “Means of a
Gradual Transition from Socialism to Communism” (translated in
Current Digest of the Soviet Press, February 24, 1951, pp. 2-9).

[...] the directive of the new five-year plan includes a resolution “to
condemn the practice of economic organizations which underestimate the
tasks of introducing new technology and mechanization of labor and
which permit the incorrect use of manpower” (New York Times,
August 28, 1952).
guised unemployment in industry will serve as a brake on productivity
advance, and the productivity potential will be far from realized.


ian dictatorship, weaken control over the population, and hamper
development of the Soviet economy along present lines. The first course is


tion of the present Soviet way of life,

It is unlikely that “labor enthusiasm” (what Stalin called the “zeal
of the millions”) will have a decisive influence on the Soviet
productivity level. There is abundant evidence that the Russian worker
is not being remolded under heat and pressure into a “Soviet man” who is
more productive than an American worker responding to money
incentives. In fact, it appears that Stakhanovism, the most significant


plore the “chase after the ruble” as a “capitalist remnant” that must
be eliminated, but they also extol the “socialist” system of piece rates
and even extend it to prison labor.

While USSR struggles to raise productivity, our own should continue
to rise at a modest rate. A mechanical extrapolation (based on a growth
of 2 per cent per year in American output per worker, 4 per cent for
USSR, and a Soviet-American productivity ratio of two-fifths in 1950)
would lead to a ratio of only three-fifths by 1970.
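
The extrapolation just described is a compound-growth calculation, and a short sketch makes the arithmetic explicit. The function name `productivity_ratio` and its parameters are illustrative choices, not anything taken from the text; only the inputs (2 per cent, 4 per cent, two-fifths, 1950 to 1970) come from the paragraph above.

```python
def productivity_ratio(initial_ratio, soviet_growth, american_growth, years):
    """Soviet-American productivity ratio after both sides compound for `years`."""
    # Each year the ratio is multiplied by (1 + Soviet rate) / (1 + American rate).
    return initial_ratio * ((1 + soviet_growth) / (1 + american_growth)) ** years


# Inputs quoted in the text: ratio 2/5 in 1950, growth 4% vs 2%, 20 years.
ratio_1970 = productivity_ratio(0.4, 0.04, 0.02, 20)
print(f"Projected 1970 ratio: {ratio_1970:.3f}")
```

Under these assumptions the result comes out close to 0.59, that is, roughly the three-fifths quoted.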
Lenin's promise of victory through superproductivity seems impossible
of fulfillment. But Lenin also mentioned other routes, and USSR


has certainly improved its world position despite inferior productivity.


[...] alternative projections, including this one.


ON PROBABILITIES IN BRIDGE 


DAN F. WAUGH,*
The National City Bank of New York


AND


FREDERICK V. WAUGH,*
U. S. Bureau of Agricultural Economics

In the game of bridge, the probability of a successful
finesse, or a favorable “break” in a suit, depends not only upon
the original deal, but also upon all the bids and plays which 
have been made by one's opponents. The widely quoted a 
priori probabilities, based solely upon the deal, are not gen- 
erally correct, and are often misleading, after play has begun. 
This paper shows that the evaluation of probabilities during 
the course of play of bridge requires the use of the Bayes 
formula. Two examples are given to illustrate the proper ap- 
plication of that formula to the measurement of probabilities 
in typical situations arising in bridge. 
Problems of probability arising in the course of play of contract
bridge are mentioned in a highly useful and entertaining book [1]
written by one of the foremost players in the country. He writes:
Probability problems in the play of the cards are of two general kinds:
1. Who holds a particular card?
2. How are the cards of a certain suit divided?


The answer to any such question depends on the amount of information you
have. Before the play begins, you may make an estimate of the odds; but
as you acquire information during the play, your estimate must be revised.


The author does not go on to suggest a method by which the revision
could be made.

The writers of the present article follow Borel [2] in suggesting that
the Bayes formula be used to evaluate the revised probabilities. Borel
uses somewhat artificial examples not taken from actual play. We shall 
analyze two examples, both taken from actual play. 


THE BAYES FORMULA 
A general exposition of the Bayes formula may be found in Uspensky
[3]. We shall consider its application to bridge, using a notation suited
to our purpose. We are concerned with certain probabilities to be
determined after the bidding is complete, and play begun. We shall
assume (as in most bridge columns) that South is declarer. After play
has started, South knows at least one, and perhaps others, of the 26
cards originally dealt to his opponents, East and West. He also knows
the 13 cards in his own hand, and the 13 cards held by dummy (North).
Often he is able to place with one opponent or the other, cards other
than those which have actually been played. If, for example, West
chooses to lead a Queen of a suit not bid by East, South may well
assume that West also holds the Jack, and probably the 10 also, of that
suit. At some stage of play, there are n unplaced cards, r held by West
and n−r held by East. These n cards could be partitioned between
West and East in any one of $\binom{n}{r}$ different ways. Each of the
possible partitions could be defined by naming r cards in the West
hand, since if West held these r cards, East would necessarily hold the
other n−r unplaced cards. In general, South will need to determine the
probability that West's actual holding belongs to a particular group
(for example, the group of possible partitions in which West holds the
King of spades, or exactly three diamonds).


* The authors are indebted to Albert H. Morehead, bridge analyst of the
New York Times, for many practical suggestions.

Let West’s possible holdings be classified into the following m
exhaustive and mutually exclusive groups:


(1)    $h_1, h_2, \cdots, h_m,$

and let the capital letters,

(2)    $H_1, H_2, \cdots, H_m,$


represent the numbers of possible partitions belonging to each of these
groups. In general, $H_i$ can be computed by well-known methods of
combinatory analysis. The total number of possible partitions is


       $N = \sum_{i=1}^{m} H_i = \binom{n}{r}.$


If these N possible partitions are considered to be equally likely, the
probability that West’s holding belongs to $h_k$ is $H_k/N$. This is,
however, a special case. In general, the several possible holdings are not
equally likely. In a sense, $H_k/N$ is the a priori probability that West's
holding belongs to $h_k$, measured after play has begun but before taking
account of the strategy followed by East and West. But $H_k/N$ is quite
different from the kind of a priori probabilities which are quoted in all
bridge books and bridge columns, where the probability of any finesse
is always 1/2, and the probability of a 3-2 break in any 5-card suit is
always 0.67826. The a priori probabilities that are universally quoted
are based upon an assumed random distribution of 26 cards; 13 to
West and 13 to East. The usually-quoted figures are not even accurate
measures of a priori probabilities after play has begun.
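
The two universally quoted figures can be reproduced by counting equally likely 13-13 partitions of the 26 East-West cards. The sketch below is only an illustration of that counting; the helper name `split_probability` and its parameters are assumptions introduced here, not notation from the paper.

```python
from math import comb


def split_probability(suit_cards, west_count, total=26, hand=13):
    """Chance that West holds exactly `west_count` of `suit_cards` specified cards,
    assuming every way of giving West `hand` of the `total` cards is equally likely."""
    ways = comb(suit_cards, west_count) * comb(total - suit_cards, hand - west_count)
    return ways / comb(total, hand)


# A finesse succeeds when one specified card lies with a given opponent.
finesse = split_probability(1, 1)

# A 3-2 break of five missing cards: West holds either 3 or 2 of them.
break_3_2 = split_probability(5, 3) + split_probability(5, 2)

print(f"finesse: {finesse:.5f}, 3-2 break: {break_3_2:.5f}")
```

These are exactly the probabilities "as of completion of the deal"; nothing in the calculation reflects cards already played or placed.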

Let S represent the particular series of bids and plays which have
been made by East and West, and let


(3)    $(S, H_1),\ (S, H_2),\ \cdots,\ (S, H_m)$

be the conditional probabilities that S would occur if West's holding
belongs to $h_1, h_2, \cdots, h_m$, respectively.


Then, if the hand were played N times, the expected frequency of
S from holdings belonging to $h_k$ is $H_k (S, H_k)$; and from all holdings
is $\sum_{i=1}^{m} H_i (S, H_i)$. After S has occurred, the inverse
probability that West's holding belongs to $h_k$ is


(4)    $(H_k, S) = \dfrac{H_k (S, H_k)}{\sum_{i=1}^{m} H_i (S, H_i)},$

and the inverse probability of each group of holdings is

(5)    $\dfrac{H_1 (S, H_1)}{\sum_{i=1}^{m} H_i (S, H_i)},\ \dfrac{H_2 (S, H_2)}{\sum_{i=1}^{m} H_i (S, H_i)},\ \cdots,\ \dfrac{H_m (S, H_m)}{\sum_{i=1}^{m} H_i (S, H_i)}.$

In our analysis, we find it convenient to prepare a table, such as

          1       2         3             4              5
         h_1     H_1     (S, H_1)     H_1 (S, H_1)     (H_1, S)
         h_2     H_2     (S, H_2)     H_2 (S, H_2)     (H_2, S)
(6)      ...     ...       ...            ...            ...
         h_m     H_m     (S, H_m)     H_m (S, H_m)     (H_m, S)

                                      Σ H_i (S, H_i)

The first column defines each of the m groups of partitions. The
frequencies in the second column are computed. The conditional
probabilities in the third column must be estimated on the basis of South’s
observation of the habits of his particular opponents or of the normal
strategies of bridge players generally. The fourth column is computed
from the data in columns 2 and 3, and its sum is shown below it. The fifth
column exhibits the inverse probabilities defined in (4) and (5), and is
obtained by dividing each entry in the fourth column by
$\sum_{i=1}^{m} H_i (S, H_i)$.
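
The five-column table amounts to a small routine: multiply each count by its conditional probability, sum the products, and divide. The following is a minimal sketch of that procedure; the function name `inverse_probabilities` and the three-group numbers used to exercise it are hypothetical and are not drawn from either example in the paper.

```python
def inverse_probabilities(counts, likelihoods):
    """Apply formula (4): counts are the H_i, likelihoods are the (S, H_i)."""
    # Column 4 of the table: expected frequency of S from each group.
    weighted = [h * p for h, p in zip(counts, likelihoods)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("the observed plays are impossible under every group")
    # Column 5: each entry of column 4 divided by the column total.
    return [w / total for w in weighted]


# Hypothetical illustration: three groups with 10, 20 and 30 partitions.
posterior = inverse_probabilities([10, 20, 30], [1.0, 0.5, 0.0])
print(posterior)
```

With equal likelihoods in every group the routine simply returns $H_k/N$, the special case noted in the text.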

Example 1. One of the leading bridge players in this country is
Mrs. Helen Sobel. In her book published a few years ago [4], she
describes, without giving the bidding, a hand which she played at a
contract of six spades. Her opponents' cards were:


spades      Q 9 8 6 4
hearts      Q 10 9 8 7 6 4 2
diamonds    J 8 7 6 5
clubs       A J 9 8 7 6 3 2


West’s opening lead was the 4 of hearts. Mrs. Sobel played the Jack
from dummy, and it won. She took that as placing the Queen of hearts
with West. At trick 2, Mrs. Sobel played the Ace of trump, West and
East following suit with the 4 and 6, respectively. East won trick 3
with the Ace of clubs, so that in order to make her contract, Mrs. Sobel
had to win all the remaining tricks. East led a small diamond to trick 4.
He has no reason to credit his partner with a possible trick in trumps.
He can see that the offense has command of hearts and clubs.
Diamonds will appear to him the least of several evils. South wins the
trick in dummy, and leads the deuce of spades. East plays the 8.
South now might play either the Jack, hoping that East holds the
Queen, or the King, hoping that of the two spades not yet played,
West holds the Queen and East the 9. The “percentage play,”
according to a well-known writer, was the finesse. Mrs. Sobel, however,
played the King; the Queen dropped; and she went on to make her
contract.

On what grounds is the finesse described as the “percentage play”?
On the grounds that as of the point of completion of the deal, there were
$\binom{26}{13}$ = 10,400,600 possible partitions of the 26 East-West cards, 13
to each; and that of these $\binom{25}{13}$ = 5,200,300 were partitions in which
East was holder of the Queen of spades; and only $\binom{4}{1}\binom{21}{11}$ =
1,410,864 were partitions in which West was the holder of just 2 spades
including the Queen. The $\binom{25}{13}$ corresponds to the fact that all West’s
13 cards are selected in this case from the 25 cards other than the Queen
of spades, which East and West hold between them. The
$\binom{4}{1}\binom{21}{11}$ corresponds to the fact that West in this case holds
with the Queen, one of the 4 other spades, and his remaining 11 cards are
selected from the 21 cards other than spades, that the two partners hold.
Hence the a priori probabilities as usually reckoned were
5,200,300/10,400,600 or 50% that East was holder of the Queen of spades,
and only 1,410,864/10,400,600 or 13.6%, that West held the Queen
doubleton.¹
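
The three counts in this paragraph are plain binomial coefficients and can be checked directly. The variable names below are illustrative only.

```python
from math import comb

# All ways of giving West 13 of the 26 East-West cards.
total_partitions = comb(26, 13)

# East holds the spade Queen: West's 13 cards come from the other 25.
east_has_queen = comb(25, 13)

# West holds the Queen plus exactly one of the 4 other spades,
# and 11 more cards from the 21 non-spades.
west_queen_doubleton = comb(4, 1) * comb(21, 11)

print(total_partitions, east_has_queen, west_queen_doubleton)
print(f"{east_has_queen / total_partitions:.1%}",
      f"{west_queen_doubleton / total_partitions:.1%}")
```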

That was, in Jacoby's language, “an estimate of the odds, before the
play begins.” It was after East’s play to trick 5 that Mrs. Sobel had
the decision to make, whether to play the Jack of spades from her own
hand, or to play the King. Her preliminary estimate, she is counselled
by Jacoby, must be revised in the light of the information that she
has acquired in the course of play thus far. At this point, she was
able to place the East-West cards as follows:


            West    East    Unplaced
spades      4       8 6     Q 9
hearts      Q 4     2       10 9 8 7 6
diamonds    8       6       J 7 5
clubs       6       A       J 9 8 7 3 2


We classify West’s possible holdings into four groups, according to
the spades held, and analyze the probabilities as indicated below:


        West's spade
Set     holding          H       (S, H)    H (S, H)    (H, S)
h_1     Q 9 4            3003    1.0       3003        0.467
h_2     Q 4              3432    1.0       3432        0.533
h_3     9 4              3432    0.0       none        0.000
h_4     4                3003    0.0       none        0.000

                     N = 12870          Σ H_i (S, H_i) = 6435


Mrs. Sobel explains in her book that she had frequently played with
the holder of the West hand. It had been her experience that he
habitually made an opening lead from worthless trumps in preference to
leading away from a suit honor. The opening trick in this hand showed
that he had led away from the Queen of hearts in this case. The second
trick showed that he held the 4 of the trump suit which he could have
led instead. Mrs. Sobel concluded that he would have led away from
the heart Queen only as an alternative to leading away from the spade
Queen. Those are the only promising cards that East and West have
between them, except for the Ace of clubs which East played to trick
3. Mrs. Sobel in effect rated the East-West plays thus far as
substantially certain for any partition belonging to set $h_1$ or $h_2$, and as
practically certain not to have been made from any belonging to set
$h_3$ or $h_4$. On those assumptions, the finesse is practically certain to fail,
and the play of the King has a better than even chance to succeed.
Using equation (4), the probability that West originally held the spade
9 4, or the lone 4, is
$[H_3 (S, H_3) + H_4 (S, H_4)] / \sum_{i=1}^{4} H_i (S, H_i) = 0$. The
probability that he originally held the Q 4 is
$H_2 (S, H_2) / \sum_{i=1}^{4} H_i (S, H_i) = 0.533$, or 53.3%.


1 We do not say that this is the correct way to compute a priori probabilities. Doubtless one should
consider the cards already played or placed. But the probabilities shown are those usually used in texts
and articles on bridge.


Thus the “estimate of the odds, before the play begins,” and the
estimate revised according to information acquired during the play,
lead to opposite conclusions:


                                 a priori    revised
chance of successful finesse     50.0%       Nil
chance of dropping Queen         13.6%       53.3%
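
The revised figures for Example 1 can be recomputed from the placement of the cards. The sketch below assumes, from the table of placed cards, that 16 cards are unplaced and that West takes 8 of them (which gives the stated N = 12870); the dictionary layout and variable names are illustrative.

```python
from math import comb

# Of the 16 unplaced cards, 14 are neither the spade Queen nor the spade 9.
# West needs 8 unplaced cards; the rest of his hand is chosen from those 14.
groups = {
    "Q 9 4": (comb(14, 6), 1.0),  # West holds both unplaced spades
    "Q 4":   (comb(14, 7), 1.0),  # West holds the Queen, East the 9
    "9 4":   (comb(14, 7), 0.0),  # West holds the 9, East the Queen
    "4":     (comb(14, 8), 0.0),  # East holds both unplaced spades
}

# Formula (4): weight each count by the conditional probability, then normalize.
total = sum(h * p for h, p in groups.values())
revised = {name: h * p / total for name, (h, p) in groups.items()}

finesse_succeeds = revised["9 4"] + revised["4"]  # East must hold the Queen
queen_drops = revised["Q 4"]                      # Queen falls under the King

print(f"finesse: {finesse_succeeds:.3f}, drop: {queen_drops:.3f}")
```

The likelihoods of 1.0 and 0.0 are the ones Mrs. Sobel's reasoning implies; any other reading of West's leading habits would change the third column and hence the result.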


Example 1 illustrates a case in which one of the alternative lines of 
play is certain to fail. Declarer obviously should try the other alterna- 
tive, however low the probability of success. Thus, the probability of 
53.3% that declarer can succeed in dropping the Queen is of only aca- 
demic interest. It is shown to indicate that the correct probability 
differs substantially from the a priori probability, which bridge experts 
use to find the “percentage play.” More important is the conclusion 
that West, rather than East, is certain to hold the Queen of spades. 
This conclusion can be reached only by an analysis of inverse prob- 
ability; whether a formal analysis, such as that above, or an informal 
analysis, such as Mrs. Sobel made. 

Example 2. Another leading bridge player, Mr. George Rapee, de- 
scribes in a newspaper article [5] a match played in the Eastern States 
contract bridge championships in March 1952. 


The bidding was:

South         West    North         East
2 no trump    pass    3 diamonds    pass
3 no trump    pass    5 no trump    pass
6 no trump    pass    pass          pass


After 8 tricks of play, South had been able to place the East-West
cards as follows:


            West         East     Unplaced
spades      Q 7 6 5 2    9 8
hearts      J 10         3 2      9 5
diamonds    J 9 8 2      K
clubs       3            5 4 2    Q 9 7 6


The 6 unplaced cards must be distributed 1 to West and 5 to East.

Declarer has lost one trick, so must take the remaining five tricks to
make his contract. The lead at this point lies in dummy. South can
make his contract by leading hearts if West’s one unplaced card is the
Queen of clubs. Or he can make it by leading clubs if West’s unplaced
card is one of the two hearts. West’s original heart and club holdings
may be classified as follows:


           West's holdings
Set     ---------------------     (S, H)    H (S, H)    (H, S)
        Hearts        Clubs
h_1     J 10 9        3            0.2       0.2         0.048
h_2     J 10 5        3            0.0       0.0         0.000
h_3     J 10          Q 3          1.0       1.0         0.238
h_4     J 10          9 3          1.0       1.0         0.238
h_5     J 10          7 3          1.0       1.0         0.238
h_6     J 10          6 3          1.0       1.0         0.238

                                   Σ H_i (S, H_i) = 4.2


Success with the club lead depends only on finding the Queen of clubs
with East. As of the point of completion of the deal, the probability of
that was 0.5.

Success with the heart lead requires (1) that East hold the 9 of
hearts, (2) that West hold the Queen of clubs, and (3) that West hold
exactly one of the other East-West clubs. As of the point of completion
of the deal, the probability of that was approximately 0.008.

If one follows the usual folklore of bridge, relying solely upon a 
priori probabilities, he would have to agree with Mr. Rapee that the 
heart lead was “the mathematically inferior play” in this case. But this 
judgment completely disregards the bidding and 8 tricks of play. And 
the developments during play have greatly changed the probabilities
in question. 

We now show how the probabilities should be appraised at the 
beginning of trick 9. 

The heart lead will succeed if West’s original holding were $h_3$; the
club lead will succeed if West’s original holding were either $h_1$ or $h_2$.
There has been nothing in the East-West play thus far, it appears to us,
and we assume that South had the same impression, that has any
relevance to the East-West heart-club holdings, except West’s play of the
10 and Jack of hearts to tricks 2 and 7. South considered it “unlikely,” the
news item states, that his particular opponent in this case would have
“false-carded” by playing 10 and then Jack from a holding of J 10 9.
To make it concrete, we assume that West would have done so no more
than 1 time in every 5. We take it as certain that he would not have
played 10 J, retaining the 5, from the holding J 10 5.

On these assumptions, as the table shows, the probability that the
East-West plays have been made from the holding $h_3$ is 0.238. The
probability that they have been made from either $h_1$ or $h_2$ is 0.048.
On that reckoning, the heart lead is not “the mathematically inferior
play.” It has, it is true, less than a 1/4 chance of succeeding, but the
club lead has less than 1/20.
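
The revised probabilities for Example 2 follow from formula (4) with one partition in each of the six groups. The sketch below uses the conditional probabilities assumed in the text (1 in 5 for the false-card from J 10 9, zero for J 10 5, certainty otherwise); the variable names are illustrative.

```python
# West's single unplaced card defines the group; each group is one partition.
unplaced_card = ["heart 9", "heart 5", "club Q", "club 9", "club 7", "club 6"]
counts = [1, 1, 1, 1, 1, 1]
likelihoods = [0.2, 0.0, 1.0, 1.0, 1.0, 1.0]

weighted = [h * p for h, p in zip(counts, likelihoods)]
total = sum(weighted)
revised = dict(zip(unplaced_card, (w / total for w in weighted)))

heart_lead = revised["club Q"]                       # succeeds only for group h3
club_lead = revised["heart 9"] + revised["heart 5"]  # succeeds for h1 or h2

print(f"heart lead: {heart_lead:.3f}, club lead: {club_lead:.3f}")
```

Raising or lowering the assumed false-card frequency changes the club-lead figure in proportion, which is why the estimate of the conditional probabilities matters more than the counting.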

The probabilities of success as ordinarily computed a priori as of 
completion of the deal, and as computed by the method here outlined, 
as of the point of South’s decision, compare as follows: 


a priori revised 
Heart lead 0.008 0.238 


Club lead 0.500 0.048 


CONCLUDING REMARKS 


Many additional examples could be given. Books and newspaper 
columns analyze hundreds of problems of probability arising during the 
play of bridge. We have never noted a case in which the analysis has 
been theoretically correct, and find that the answers given are often 
misleading. Theoretically any problem of probability arising during
the play of bridge could be measured correctly by the Bayes formula, 
and only by the Bayes formula. 

The application of the Bayes formula to typical problems arising in 
bridge may seem too complicated to have practical value in actual play. 
We note, however: (1) that an understanding of the principles of in- 
verse probability will frequently help a player make a quick and rough 
judgment as to which alternative is superior, and (2) that one can 
work out once and for all the probabilities associated with certain 
typical situations which commonly arise. Concerning the second point,
no player would stop the game to compute the a priori probability that
6 unplaced clubs would be split 3-3, but he can remember that it is
less than 1/2.

The main difficulty with the analysis is that of making realistic
estimates of the conditional probabilities $(S, H_k)$. But a good bridge
player can form some shrewd judgments concerning these conditional
probabilities by observing the habits of his opponents, and of bridge
players generally. Thus, East not having bid, against a no trump
contract West's opening lead very commonly is the fourth highest
card in his best suit; against other contracts he may avoid leading away
from a King; if he leads a Queen in an unbid suit he probably holds
the Q J 10; and so on. Unless one takes account of these habits,
conventions, or strategies, he can not estimate which alternative play is
superior. The commonly-quoted a priori probabilities usually are not
very helpful, and often are downright misleading.


2 We mean no disrespect to the bridge experts. Many of them are brilliant men, and most of them
know more about bridge than we do. We object only to the plain fact that the “probabilities” which
they universally use do not, in general, measure the probability of success with a given line of play.

PROBABILITIES OF CERTAIN SOLITAIRE
CARD GAMES


ROBERT E. GREENWOOD
University of Texas


I. INTRODUCTION 


Since the days of the Extra Sensory Perception (ESP) experiments,
mathematicians and statisticians have been especially interested
in “matching” card problems. In this note, two types of solitaire card
games involving matches will be considered in which the ordinary 52
card deck is used. For convenience the two card games will be called
games A and B. (It will be recalled that the ESP experiments used 5
sequences of 5 cards each, or a 25 card deck. A tabulation of matching
probabilities for such a 25 card deck is given by Greville [4].)

In game A with 52 cards, a player plays out the cards one at a time
against another deck (or against a standard order) and scores a success
if he can go through the deck without getting a match in suit and
sequence. In game B the suit feature is disregarded and the deck is
considered as a multiple set of four sequences of 13 cards each. The player
scores a success if he can go through this multiple deck without getting
a match (sequence match only).

Game A is a special case of the classical “problème des rencontres,”
and has been dealt with by many authors. See Huff [5] and the
appended editorial note which gives additional references, or Feller [2],
pages 62-63 and 66-69. The probability of obtaining exactly k matches
is given approximately by


(1)    $P_k \approx \dfrac{1}{e} \cdot \dfrac{1}{k!}, \qquad k = 0, 1, 2, \cdots.$
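
For comparison with approximation (1), the exact matching probabilities for a deck of n distinct cards are given by the classical rencontres formula, $P_k = \frac{1}{k!}\sum_{j=0}^{n-k} \frac{(-1)^j}{j!}$. That formula is standard but is not written out in the text, so the sketch below should be read as a supplementary check; the function name is an illustrative choice.

```python
from fractions import Fraction
from math import exp, factorial


def rencontres_probability(n, k):
    """Exact chance of exactly k matches when n distinct cards meet a fixed order."""
    # Inclusion-exclusion tail for the n - k cards that must not match.
    tail = sum(Fraction((-1) ** j, factorial(j)) for j in range(n - k + 1))
    return tail / factorial(k)


exact_no_match = rencontres_probability(52, 0)
print(float(exact_no_match), exp(-1))
```

For 52 cards the exact value and 1/e agree far beyond any precision relevant to play, which is why the Poisson form serves for game A.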


Е 

Game B has been considered by Battin [1] and Kaplansky [6].
Special generating functions to give the probabilities of the various
numbers of matches or hits have been developed. Kaplansky [6, page
911] notes that asymptotic approximations are available for large decks,
but states that such approximations are too crude for game B with
52 cards. For a fourfold deck with a large number of cards (n >> 52),
the probability of no matches is approximately $1/e^4$.

Joseph A. Greenwood [3] investigated the variance of certain
card-matching processes, and from a general formula given by him the
variances of the number of matches in games A and B are found to be


(2)    $\sigma_A^2 = 1; \qquad \sigma_B^2 = 64/17.$

The means of the number of matches are well-known,

(3)    $\mu_A = 1; \qquad \mu_B = 4.$

II. PROBABILITY OF A WIN FOR GAME B


To calculate the probability $P_0$ for a win (i.e. no matches) for game
B, one may proceed as follows. Let S be defined by the relation


(4)    $S = x_1 + x_2 + \cdots + x_{13},$

where $x_i$ is associated with the ith card in some convenient numbering
of the 13 cards in a sequence. Then a draw or a play from $(S - x_i)$
could be interpreted as signifying that the ith card was not played but
that the other twelve possible cards all had equal chances of being
played.

The product

(5)    $\prod_{i=1}^{13} (S - x_i)^4$

could be interpreted as signifying that for four specified plays no cards
marked 1 were played, for four other specified plays no cards marked 2
were played, etc. Nothing in the above signifies that four cards of
each category must be played in the 52 plays. However, if one were to
select the coefficient of $(x_1 x_2 \cdots x_{13})^4$ in the expansion of form (5),
one does get such an assurance. Indeed, this coefficient is the number
of ways in which a 52 card deck (considered as a multiple deck of four
13 card sequences) can be played out so as to give no matches against
a standard order. The division by the total number of ways of playing
out such a deck, $52!/(4!)^{13}$, will give the desired probability $P_0$. Hence
the coefficient of the term $(x_1 x_2 \cdots x_{13})^4$ in the expansion of the form


(6)    $K(x_1, x_2, \cdots, x_{13}) = \dfrac{(4!)^{13}}{52!} \prod_{i=1}^{13} (S - x_i)^4$

       $= \dfrac{(4!)^{13}}{52!} \left[ S^{52} - S^{51} \sum_{i} 4 x_i + S^{50} \left( \sum_{i} 6 x_i^2 + \sum_{i<j} 16 x_i x_j \right) - \cdots \right]$


is the probability $P_0$. The numerical coefficients in the above expansion
are obtainable in terms of multinomial and binomial coefficients.

If one identifies $A_i$ as the absolute value of the coefficient of the
term involving $S^{52-i}$ in expansion (6), then


(7)    $P_0 = A_0 - A_1 + A_2 - \cdots + A_{52}.$
In addition to the probability for no matches (called $P_0$ above) the
probabilities for 1, 2, 3, ⋯, matches are of some interest. Rather
than develop these explicitly, it is convenient to refer to a procedural
method of Kaplansky, who formulated an algebraic symbolism and
applied combinatorial methods thereto in order to obtain an analogue
to Poincaré's formula in probability. The adaptation of Kaplansky's
result [6] to the problem at hand gives


(8)    $P_k = A_k - \binom{k+1}{k} A_{k+1} + \binom{k+2}{k} A_{k+2} - \cdots \pm \binom{52}{k} A_{52}.$

This is just Poincaré's formula with a different interpretation of the
symbolism.


The numerical values $\{A_i\}$ may be computed by direct but tiresome
methods. One gets


(9)    $A_0 = 1; \quad A_1 = 4; \quad A_2 = 134/17; \quad A_3 = 13004/1275.$
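
The "direct but tiresome" computation can be delegated to exact integer arithmetic. The sketch below does not expand form (6) symbolically; it swaps in the equivalent rook-polynomial (inclusion-exclusion) count, in which each rank contributes a full 4 by 4 block of forbidden positions. That substitution, and every function name here, is an assumption of this illustration rather than the author's procedure; the tests check only that it reproduces the values in (9) and Table 2.

```python
from fractions import Fraction
from math import comb, factorial


def full_board_rook_numbers(size):
    """Ways to mark m non-attacking hits on a size-by-size block, m = 0..size."""
    return [comb(size, m) ** 2 * factorial(m) for m in range(size + 1)]


def poly_power(poly, exponent):
    """Coefficient list of poly ** exponent (plain polynomial multiplication)."""
    result = [1]
    for _ in range(exponent):
        product = [0] * (len(result) + len(poly) - 1)
        for i, a in enumerate(result):
            for j, b in enumerate(poly):
                product[i + j] += a * b
        result = product
    return result


def a_values(ranks=13, copies=4):
    """A_j = r_j (n - j)! / n!, with r_j the rook numbers of the 13 blocks."""
    n = ranks * copies
    rooks = poly_power(full_board_rook_numbers(copies), ranks)
    return [Fraction(r * factorial(n - j), factorial(n)) for j, r in enumerate(rooks)]


def match_probability(k, a):
    """Relation (8): alternating binomial sum of the A_j from j = k upward."""
    return sum((-1) ** (j - k) * comb(j, k) * a[j] for j in range(k, len(a)))


A = a_values()
P0 = match_probability(0, A)  # relation (7) is the k = 0 case
print(float(P0), float(match_probability(1, A)))
```

Because the arithmetic is exact, no extrapolation of the later terms is needed in this version; the decimal values can be compared with the estimated probabilities tabulated for game B.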


The denominators tend to get larger; to avoid changing denominators,
a table of values of $CA_i$, where C was chosen so as to give integral values
to $CA_i$, is given in Table 1. A decimal approximation to $A_i$ is also
given. The value of C was chosen to be


(10)    $C = 52^{(15)} = 52 \cdot 51 \cdot 50 \cdots 38$
          = 5,860,187,598,276,457,205,760,000

(though the least common denominator required was not quite that
large).
The estimated values for $A_{16}, \cdots, A_{19}$ were obtained by computing
ratios of successive $A_i$ values, i = 0, 1, ⋯, 15, constructing a
difference table, and extrapolating.


TABLE 1

 i                              C A_i    Decimal approximation to A_i

 0    5 8601 8759 8276 4572 0576 0000     1
 1   23 4407 5039 3105 8288 2304 0000     4
 2   46 1920 6695 1120 3007 3952 0000     7.8823 5294
 3   59 7693 1727 6852 5878 4604 1600    10.1992 1569
 4   57 1075 3987 0269 2981 1210 2400     9.7450 0200
 5   42 9610 1417 7847 3070 1987 8400     7.3309 9640
 6   26 4959 3716 6818 1148 6146 5600     4.5213 4624
 7   13 7742 2222 3641 6652 5747 2000     2.3504 7462
 8    6 1590 4477 5471 4063 5549 6960     1.0509 9789
 9    2 4053 2409 1387 4685 5882 7520      .4104 5172
10      8303 3047 3254 8076 8716 8000      .1416 9111
11      2558 1494 8800 1913 1236 3520      .0436 5303 1
12       708 9146 7824 0756 1101 4240      .0120 9713 8
13       177 8606 3073 7064 4982 8864      .0030 3506 72
14        40 6209 3763 9221 2378 4102      .0006 9316 787
15         8 4841 4962 3331 6627 2512      .0001 4477 608
16                                         .0000 2776 (estimated)
17                                         .0000 0490 (estimated)
18                                         .0000 0080 (estimated)
19                                         .0000 0012 (estimated)


Using truncated forms of relations (7) and (8) one may now estimate
the values $P_k$, k = 0, 1, 2, ⋯, 7, as given below in Table 2. Using a rule
of thumb method, it is possible to get an estimate of the size of the
error. Note that $A_i > A_{i+1}$ for i = 3, 4, ⋯, 14. The series (8) is an
alternating series; arithmetical calculations suggest that for each k
in the range above there is an integer m such that after the term
involving $A_{k+m}$ the series is an alternating decreasing series. One may then
estimate the size of the error as being less than the size of the next
(rejected) term. Thus


(11)    $|\,\text{Estimated error in } P_i\,| \le \binom{20}{i} A_{20}, \qquad i = 0, 1, 2, \cdots, 7.$

TABLE 2

         Estimated proba-        Estimated       Gram-Charlier
         bilities for game B     error           probabilities
P_0      .016233                 +.00000002      .016161
P_1      .068899                 −.0000004       .068953
P_2      .14416                  +.000004        .144370
P_3      .19821                  −.00002         .198240
P_4      .2013                   +.0001          .201113
P_5      .1613                   −.0003          .160890
P_6      .1052                   +.0008          .105728
P_7      .060                    −.0015          .058665

Assuming $A_{20}$ = 0.0000 0002, the sizes of the estimated errors may be
computed from relation (11). These values are also tabulated in Table
2. It must be stressed that these estimates are only useful heuristic
guides, i.e. no claim is made that the absolute error satisfies relation
(11).

As is well known, game A may be approximately described by the
Poisson distribution, relation (1). For game B, however, the mean and
the variance of the distribution of hits are not equal (being 4 and 64/17
respectively). Rather than choose the Poisson distribution (for which
the mean and the variance are equal) it may be advisable to consider
some distribution which does not require equality of these statistics,
for example the Gram-Charlier series of type B.

If f(k) is the Poisson distribution entry, then the first two non-zero
terms of the Gram-Charlier series [7] are


(12)    $g(k) = f(k) + \tfrac{1}{2}(\text{variance} - \text{mean})\,\Delta^2 f(k).$


Values of g(k) have been calculated and are also given in Table 2. The
rather close agreement between the Gram-Charlier series and the
estimated probabilities from relation (8) is certainly remarkable.
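Relation (12) can be checked numerically with a short sketch. Two assumptions are made here and are not stated in the text: the Poisson parameter is taken equal to the mean, 4, and the second difference is read as the backward difference f(k) - 2f(k-1) + f(k-2), with f zero for negative arguments. Under that reading the computed values agree with the Gram-Charlier column of Table 2 to six decimals.

```python
from math import exp, factorial

MEAN = 4.0               # mean of the game B hit distribution
VARIANCE = 64.0 / 17.0   # variance of the game B hit distribution


def poisson(k, lam=MEAN):
    """Poisson probability f(k); zero for negative k."""
    if k < 0:
        return 0.0
    return exp(-lam) * lam ** k / factorial(k)


def second_backward_difference(k):
    """Assumed reading of the second difference in relation (12)."""
    return poisson(k) - 2.0 * poisson(k - 1) + poisson(k - 2)


def gram_charlier_b(k):
    """First two non-zero terms of the type B series, relation (12)."""
    return poisson(k) + 0.5 * (VARIANCE - MEAN) * second_backward_difference(k)


values = [gram_charlier_b(k) for k in range(8)]
for k, g in enumerate(values):
    print(f"g({k}) = {g:.6f}")
```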

The numerical calculations were either checked by another method 
or repeated by the same method. The results stated are believed to be 
accurate, but due to the extreme tediousness of this work no claim for 
error-free results is made. 


III. EXPERIMENTAL TEST


In order to observe the theory in practice, 1000 games of solitaire 
were played, the number of games being fixed in advance. For each 
game, the cards were thoroughly shuffled (at least 6 interlacing shuffles 
and 2 mixed cuts using three or more stacks), and played out for all 52 
cards. Two types of scores were kept, one as if game A were being 
played and the other as if it were game B. For game A, it is possible 
to compare the distribution of hits with a theoretical distribution based 
on relations (1), and the variance of the hit distribution with (2). For 
game B a limited comparison is possible based on the partial distribu- 


GAME A

Number of hits        0     1     2     3     4     5
Observed games      359   384   174    66    16     1

Expected mean of hit distribution = 1, observed mean = 0.999. Expected variance of hit distribution = 1,
observed variance (about expected mean) = 0.957.


GAME B

Number of hits                      0    1    2    3    4    5    6    7    8    9   10   11
Expected games (from relation 8)   16   69  144  198  201  160  105   60    —    —    —    —
Observed games                     14   56  174  194  210  153  108   55   22    8    6    0
Expected games (Gram-Charlier)                    198  201  161

Expected mean of hit distribution = 4, observed mean = 3.932. Expected variance of hit distribution
= 64/17 = 3.765 approximately, observed variance (about expected mean) = 3.466.
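The summary figures quoted under the game B table can be recomputed from the observed counts. The sketch below assumes the twelve observed entries correspond to 0, 1, ..., 11 hits and, as the caption says, takes the variance about the expected mean rather than about the sample mean; the function name is illustrative only.

```python
def mean_and_variance_about(counts, expected_mean):
    """Return (number of games, sample mean, variance about expected_mean).

    counts[k] is the number of games with exactly k hits.
    """
    n = sum(counts)
    mean = sum(k * c for k, c in enumerate(counts)) / n
    var = sum((k - expected_mean) ** 2 * c for k, c in enumerate(counts)) / n
    return n, mean, var


# Observed games for game B, assumed to be indexed by 0..11 hits.
game_b_observed = [14, 56, 174, 194, 210, 153, 108, 55, 22, 8, 6, 0]

games, observed_mean, observed_var = mean_and_variance_about(game_b_observed, 4.0)
print(games, round(observed_mean, 3), round(observed_var, 3))
```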


It will be noted that the deviations from the expected values are for
the most part small and reasonable. The close agreement between the
values computed from relation (8) and the Gram-Charlier values has
already been commented on.
NON-LINEAR FUNCTIONAL RELATIONSHIP BETWEEN
TWO VARIABLES WHEN ONE VARIABLE IS 
CONTROLLED 


R. C. Geary 
Central Statistics Office, Dublin 

Joseph Berkson has indicated that when the error of one of
two variates is “controlled” in a sense which he defines, and
when the true relation between the variates is linear, the classical
regression formula can validly be used for the unbiased
estimation of the coefficients. In the present paper, Berkson's
results are discussed and a theory of estimation and of tests of
significance is sketched for the case of controlled experiments
when the inherent relation between two variables is non-linear.
The results are applied to the study of a constructed example.


JOSEPH BERKSON [1] has made a significant contribution to regression
theory and practice in isolating a case in which classical regression
theory can validly be used to express a functional relationship between
two variables, both subject to error. Berkson's viewpoint justifies the
use of linear regression theory without qualification in the typical
scientific experiment, in which conditions are partially subject to
control. This viewpoint may have been known implicitly or explicitly to
many statisticians. The writer was not aware of it, though he has
worked on the theory of relationship between statistical variables. That
other students may be in the same situation may justify the present
further consideration of Berkson’s results.

Berkson's main point is that in the case of the “independent vari- 
able” it is possible to distinguish between what he terms “controlled” 
and “uncontrolled” measurements. The objects of the present contribu- 


appropriate to controlled observations. It is shown that, while the
R. A. Fisher [2] theories of tests of significance apply unchanged in the
linear case, they do not apply in the case of non-linear inherent rela- 


It may be useful, at the outset, to explain briefly the difference
between uncontrolled and controlled measurements by reference to a
particular example. An operator is making controlled observations when
he is measuring out quantities x of a drug fixed in advance, say 1 gm,
2 gms, 3 gms, etc., and measuring some reaction v. In making these
measurements, however, he makes errors: when he thinks he is measuring
1 gm, he has really measured 0.95 gm, 1.13 gm, etc.; when he
thinks he has measured 2 gms, he has really measured 1.81 gm, 2.05
gm, etc. Assuming a linear relationship between the nominal measurement
x and the actual reaction v, it is clear that there are two sources of
error in v: (i) that due solely to the error in measuring x, and (ii) that in
measuring v itself, the actual recorded value being y. The operator's
recorded pairs of observations will, of course, be (x, y) where x is 1
gm, 2 gms, 3 gms, etc. In this example, if the inherent relation is
linear, i.e. if the relationship between the true values u and v, of x and y
respectively, is $v = \alpha + \beta u$, then Berkson has shown that the classical
regression formula of y on x will give statistically consistent estimates
of the coefficients $\alpha$ and $\beta$, making the usual assumption of randomness
in regard to the errors of observation in the measurements of the two
variables. This conclusion extends to the many-variable case as well,
when the relation is linear.
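The distinction can be illustrated by simulation. The sketch below is not taken from the paper: the coefficients, error standard deviations, grades and seed are hypothetical, and normal errors are assumed. In the controlled scheme the recorded x is the fixed nominal grade and the true quantity is u = x - e; in the uncontrolled scheme the true quantity is what it is and the recorded value is x = u + e.

```python
import random

ALPHA, BETA = 1.0, 2.0          # hypothetical true coefficients
SIGMA_E, SIGMA_F = 0.5, 0.5     # hypothetical error standard deviations
GRADES = [1.0, 2.0, 3.0, 4.0, 5.0]


def ols_slope(xs, ys):
    """Classical least-squares slope of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def controlled_sample(reps, rng):
    """Nominal x fixed in advance; true quantity u = x - e."""
    xs, ys = [], []
    for _ in range(reps):
        for x in GRADES:
            u = x - rng.gauss(0.0, SIGMA_E)
            xs.append(x)
            ys.append(ALPHA + BETA * u + rng.gauss(0.0, SIGMA_F))
    return xs, ys


def uncontrolled_sample(reps, rng):
    """True quantity u given; recorded x = u + e."""
    xs, ys = [], []
    for _ in range(reps):
        for u in GRADES:
            xs.append(u + rng.gauss(0.0, SIGMA_E))
            ys.append(ALPHA + BETA * u + rng.gauss(0.0, SIGMA_F))
    return xs, ys


rng = random.Random(1953)
slope_controlled = ols_slope(*controlled_sample(4000, rng))
slope_uncontrolled = ols_slope(*uncontrolled_sample(4000, rng))
print(slope_controlled, slope_uncontrolled)
```

With these settings the controlled slope should settle near the true value 2, while the uncontrolled slope is pulled toward zero, toward roughly 2 × 2/(2 + 0.25) since the grades have variance 2 and the x-error variance is 0.25.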

In the case of uncontrolled measurements, the operator is not trying to
work to any preassigned schedule of measurements of the “independent”
variable. He simply measures x and the reaction y and associates
one with the other. In this case the classical regression approach
gives biased estimates of the coefficient $\beta$ even if the number of pairs
of observations is indefinitely great, for reasons which are well-known:
in non-mathematical terms the regression of “y on x” means that one
is associating the y's with the x's strung out in order. In ordering the
x's, however, one is necessarily taking into account the errors in the
x's as well as the “hard core” which is the correct measurement. In
simple terms, the x's to the right of the scale will contain large positive
errors and those on the left large negative errors. This is why the
regression of y on x is “too flat” in the x-direction and, of course, if one
orders according to y, too flat in the y-direction in the case of regression
of x on y: hence the two regressions. While in Berkson's case of
controlled experiment and in the case of uncontrolled experiment, which
occurs when neither variable can be regarded as the “independent”
variable (case II of Table 1), both regressions cannot be valid (in
Berkson’s case one only, and in case II neither, is valid), Frisch [3] and
Reiersøl [7] have shown that in the case of uncontrolled experimentation
both regression equations can be used to derive useful information
about the true values of coefficients. In the case of two variables in
fact, it can easily be shown that the true (functional) coefficient $\beta$
must lie between the two regression coefficients. Except when the
number of pairs of observations is very large and especially when the
variables are closely correlated (when the regression lines lie close
together) it is usually as good a procedure as any to take a simple average
of the two regression coefficients as the estimate of the true value since
this estimate is likely to differ from the true value by less than the error
standard deviation from the methods referred to for case II in column 3
of Table 1. While, therefore, the answer to Berkson's rhetorical
question [1]: “Are there two regressions?” is “No,” it must be added that in
many cases valuable information can be derived from the two regression
coefficients.
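The bracketing property can be seen in a small simulation. Everything specific here is an assumption made for illustration: normal true values and errors, the chosen variances, the sample size and the seed. Both regressions are expressed as slopes of y against x, so the x-on-y regression enters through the reciprocal of its coefficient.

```python
import random

BETA = 2.0      # hypothetical true functional coefficient
N_PAIRS = 20000

rng = random.Random(7)
xs, ys = [], []
for _ in range(N_PAIRS):
    u = rng.gauss(0.0, 2.0)                  # true abscissa
    xs.append(u + rng.gauss(0.0, 1.0))       # recorded x = u + e
    ys.append(BETA * u + rng.gauss(0.0, 1.0))  # recorded y = v + f

mx, my = sum(xs) / N_PAIRS, sum(ys) / N_PAIRS
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

slope_y_on_x = sxy / sxx          # too flat: errors in x attenuate it
slope_from_x_on_y = syy / sxy     # reciprocal of the x-on-y coefficient
simple_average = 0.5 * (slope_y_on_x + slope_from_x_on_y)

print(slope_y_on_x, slope_from_x_on_y, simple_average)
```

How close the simple average comes to the true value depends on the relative sizes of the two error variances, which is why the text recommends it only as a working compromise.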

Table 1 shows the appropriate solution applicable to the commonest
cases when the inherent relation is linear. The actual observations are
always x and y where


        $x = u + e$
(1)     $y = v + f$
        $v = \alpha + \beta u,$


e and f are the respective random errors, u and v the respective
(unknown) true observations which are presumed to be connected by an
exact linear relation. The statistical problem is the estimation of the
coefficients $\alpha$ and $\beta$.


TABLE 1

INHERENT RELATION LINEAR

Case | Conditions | Solution
I. Classical regression | e = 0, f statistically independent of u = x and v | Least squares, strictly applicable only when f normally distributed, mean zero and variance the same for all values of x, when the R. A. Fisher [2] significance test theory for coefficients and linearity also applies.
II. Both variables subject to uncontrolled error | e and f statistically independent of one another and of u and v | (i) Wald [8]: classification of observations into two groups. (ii) Reiersøl [6]: used third variable z related to x and y. (iii) Geary [4]: used semi-invariants of two dimensions of total degree higher than two, only for non-normal distribution.
III. Both variables subject to error, that of x controlled | e and f statistically independent of one another and of x and v | Berkson [1]: showed that classical regression formula for y on x applied, subject to special conditions specified above for I. Fisher significance tests apply.
It will be useful for what follows to reproduce what is, in effect,
Berkson's algebra. From (1),


(2)     $y = \alpha + \beta(x - e) + f.$


Let the number of x-grades be m. To simplify the algebra I shall
assume that, since the scale of x, the presumed observation, is under the
control of the experimenter, he is working to such a scale that


        $\sum_{i=1}^{m} x_i^{2k-1} = 0, \quad k = 1, 2, \ldots,$


and further that the “experiment” consists in making m pairs of actual
observations $(x_i, y_i)$. There may be many replications of the
experiment. Then, denoting by E the average of an indefinitely large number
of replications, and from (2), assuming that in each x-grade Ee and Ef
are both zero,


        $\sum_{i=1}^{m} x_i^{2k}\, E y_i = \alpha \sum_{i=1}^{m} x_i^{2k} = m \alpha \mu_{2k}, \quad k = 0, 1, 2, 3, \ldots;$
(3)
        $\sum_{i=1}^{m} x_i^{2k-1}\, E y_i = \beta \sum_{i=1}^{m} x_i^{2k} = m \beta \mu_{2k}, \quad k = 1, 2, 3, \ldots,$


where $\mu_{2k}$ is the 2kth moment of the x, the odd moments being now
zero. The simplest solution in ($\alpha$, $\beta$) is


        $m\alpha = \sum_{i=1}^{m} E y_i$
(4)
        $m\beta\mu_2 = \sum_{i=1}^{m} x_i\, E y_i.$


Consistent estimates a and b of $\alpha$ and $\beta$ respectively are given by


        $m a \mu_{2k} = \sum_{i=1}^{m} x_i^{2k}\, \bar{y}_i, \quad k = 0, 1, 2, 3, \ldots,$
(5)
        $m b \mu_{2k} = \sum_{i=1}^{m} x_i^{2k-1}\, \bar{y}_i, \quad k = 1, 2, 3, \ldots,$


where $\bar{y}_i$ is the mean of the observations y in the ith x-grade; $\bar{y}_i = y_i$ if
there are no replications. The classical regression estimates will be
obtained from (5) by taking the lowest values of k. It may be worth
observing that, while all values of k will yield consistent estimates of the
coefficients, we have no hesitation in favouring the classical regression,
or simplest solutions, on grounds of efficiency, when the variance of $\bar{y}_i$
is the same in each of the x-grades, for then the regression solutions
are the maximum likelihood solutions (when e and f are normally
distributed) and so have minimum variance.

As regards tests of significance and the establishment of confidence
limits of the estimates a and b, it is obvious from (2) that the R. A.
Fisher [2] theory, with the usual assumptions, applies literally. The
error in the ordinate is, however, now $(f - \beta e)$ instead of f as in the case
of the Fisher theory. The basic assumptions are satisfied, however,
namely that the error variance can be assumed to be the same and the
errors normally distributed for the different values of x. Since f and e
are statistically independent, the (unknown) error variance will be
$\sigma_f^2 + \beta^2\sigma_e^2$ and the significance test will be less sensitive and the range
of the confidence limits wider than in the Fisher case. In this regard
the considerable effect of the coefficient $\beta$ will be noted.


THE NON-LINEAR CASE 


The sampling theory appropriate to the non-linear case of controlled
experiments is, however, fundamentally different from the Fisher
theory, which applies when the error e is zero and for which, as Fisher
has shown, the appropriate tests of significance are basically identical
with those used in the linear case. It will suffice to discuss the
estimation of coefficients for the case of the polynomial of the third
degree. The algebra is extremely simple and the extension to higher
degree polynomials will present no special difficulty.

As before, the presumed measure of the abscissa is x but the real
measure is u, the error being e, so that x = u + e, and the actual measure
of the ordinate is y with error f, so that y = v + f. The exact relation
between u and v is


(6)     $v = \alpha + \beta u + \gamma u^2 + \delta u^3.$


The problem is to estimate the coefficients $\alpha$, $\beta$, $\gamma$, and $\delta$ and to discuss
their sampling errors. We assume that $Ee$, $Ee^3$, and $Ef$ are all zero for
each x; that the unknown variance $\epsilon_2 = Ee^2$ is the same for all the m
values of x; and that


e 


In the linear case a sampling theory can be derived from a single
observation y for each x. According to the method developed in the
present paper replications are required in the non-linear case. We
assume, for simplicity, that the number of observations in each of the
m x-grades is the same, i.e. that the whole experiment is repeated n
times. From (6)


(8)     $y_i = \alpha + \beta(x_i - e) + \gamma(x_i - e)^2 + \delta(x_i - e)^3 + f_i.$


Multiplying (8) across by 1, $x_i$, $x_i^2$, and $x_i^3$ in succession, averaging
in each x-grade and summing for all grades, we find


        $\nu_0 = \xi + \gamma\mu_2$
(9)     $\nu_1 = \eta\mu_2 + \delta\mu_4$
        $\nu_2 = \xi\mu_2 + \gamma\mu_4$
        $\nu_3 = \eta\mu_4 + \delta\mu_6$

where

        $\nu_k = \frac{1}{m}\sum_{i=1}^{m} x_i^k\, E y_i$
(10)    $\xi = \alpha + \gamma\epsilon_2$
        $\eta = \beta + 3\delta\epsilon_2,$


$\epsilon_2$ being the variance of the error e.
From (9) we can readily find $\gamma$ and $\delta$ as follows:


        $(\mu_4 - \mu_2^2)\,\gamma = \nu_2 - \mu_2\nu_0 = \frac{1}{m}\sum_{i=1}^{m}(x_i^2 - \mu_2)\,E y_i$
(11)
        $(\mu_2\mu_6 - \mu_4^2)\,\delta = \mu_2\nu_3 - \mu_4\nu_1 = \frac{1}{m}\sum_{i=1}^{m}(\mu_2 x_i^3 - \mu_4 x_i)\,E y_i.$


It will be noted that, while $\xi$ and $\eta$ can also be found from (9), $\alpha$
and $\beta$ cannot, without the intrusion of the nuisance parameter $\epsilon_2$.
Actually the estimates of $\gamma$ and $\delta$ are equivalent to the least squares
estimates: those of $\alpha$ and $\beta$ are not, except in the trivial case of $\epsilon_2 = 0$.
We are usually interested only in the formula for $\delta$ in order to dra


1 Theoretically it is possible to conceive of a model whereby, with a single y-observation in each
grade, maximum likelihood estimates of the error frequency parameters, together with the estimates of
the coefficients could be derived. Assume, e.g., that the independent errors e and f are normally
distributed with means zero and that the error distributions are the same in each x-grade, and that the
number of grades (m) is not less than all the parameters to be estimated. However, even in the simplest
non-linear case, i.e. when the coefficient $\delta$ in (6) is zero, the five maximum likelihood equations (i.e., those
for the estimation of $\sigma_e$, $\sigma_f$, $\alpha$, $\beta$, $\gamma$) are extremely complicated and impracticable of application.
that it is not significantly different from zero. If the estimate of $\delta$,
according to the sampling theory developed below, is not significantly
different from zero, then the universal variance of y in the ith x-grade
will be, from (8),


(12)    $(\beta + 2\gamma x_i)^2\,\epsilon_2 + \gamma^2(\epsilon_4 - \epsilon_2^2) + \phi_2$


where $\epsilon_4$ is the fourth moment of the error e and $\phi_2$ is the variance of
error f. (m−1) estimates of $\epsilon_2$ can accordingly be found by equating
the differences of the estimated variances of the sample values of y
in consecutive x-grades to the differences derived from (12), using, of
course, the estimates b and c of $\beta$ and $\gamma$, as follows:


(13)    $e_{2i}\{(b + 2cx_{i-1})^2 - (b + 2cx_i)^2\} = \operatorname{var} y_{i-1} - \operatorname{var} y_i, \quad i = 2, 3, \ldots, m.$


In practice the (m−1) estimates $e_{2i}$ of $\epsilon_2$ will, of course, differ: the
simplest procedure may be to take the simple average of the (m−1)
estimates as the definitive estimate $e_2$ of $\epsilon_2$. It is, of course, evident that
this procedure for the estimation of $\epsilon_2$ is designed only to furnish a
consistent estimate. It is not equivalent to the maximum likelihood
solution and is probably not the most efficient. With $\delta = 0$, the
estimates a, b, c of $\alpha$, $\beta$ and $\gamma$ are then immediately obtained from (9).


SAMPLING ASPECTS FOR THE NON-LINEAR CASE


It may be remarked at the outset that the approach in this paper
to tests of significance in the non-linear case is essentially
approximative. It consists simply in making estimates of the variances and using
these estimates as tests: this test becomes exact only when the number
of replications is indefinitely large. As is well-known, the Fisher test in
the non-linear case is exact, under certain conditions, even when there
is only a single series of y-observations, i.e. one replication of the
experiment.

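From (11), the estimates c and d of $\gamma$ and $\delta$ are found by substituting
sample means $\bar{y}_i$ for $Ey_i$ and the sampling variances are given by


(14)    $(\mu_4 - \mu_2^2)^2 \operatorname{var} c = \frac{1}{m^2 n}\sum_{i=1}^{m}(x_i^2 - \mu_2)^2 \operatorname{var} y_i$

(15)    $(\mu_2\mu_6 - \mu_4^2)^2 \operatorname{var} d = \frac{1}{m^2 n}\sum_{i=1}^{m}(\mu_2 x_i^3 - \mu_4 x_i)^2 \operatorname{var} y_i$


where n is the number of replications of the whole experiment. If $\delta$ is
not significantly different from zero, from (9)


(16)    $\mu_2^2 \operatorname{var} b = \frac{1}{m^2 n}\sum_{i=1}^{m} x_i^2 \operatorname{var} y_i.$


As to the estimate a of $\alpha$, its variance, given $e_2$, is, from (9) and (10),

(17)    $\operatorname{var} a = \operatorname{var} \bar{y} - 2(\mu_2 + e_2)\operatorname{covar}(\bar{y}, c) + (\mu_2 + e_2)^2 \operatorname{var} c,$

with

(18)    $\operatorname{covar}(\bar{y}, c) = \sum_{i=1}^{m}(x_i^2 - \mu_2)\operatorname{var} y_i \,/\, m^2 n(\mu_4 - \mu_2^2).$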


STUDY OF A CONSTRUCTED EXAMPLE


To see how the foregoing procedures would work in practice, an
artificial example has been constructed on the following lines. Let the
inherent relation be


(19)    $v = 3 + 2u - u^2$


and let there be 5 (= m) values of x, namely −6, −3, 0, +3, +6.
There were 10 (= n) replications of the whole experiment. The
observed values of x, deemed to be these five values, are subject to errors
which are assumed to be unbiased. The ordinates are also subject to
random error. Actually the “induced” errors were normally distributed
with mean zero and variance unity and were in fact those (×1/10)
given by E. S. Pearson [5, p. 349] for the normal population, the 50
values in the first column of Pearson's table being regarded as the
errors e in x and the 50 values in the second column being those (f)
for y. Thus the actual values in the case of the second observation
(i.e. x = −3) of the first of ten experiments were not the true values
x = −3 and v = 3 + 2×(−3) − (−3)² = −12 but u = −3 + 1.0 = −2.0 and
y = 3 + 2×(−2.0) − (−2.0)² − 0.4 = −5.40. The 50 actual values of y as
"observed" are given in the table, with computed means and variances.
The theoretical values, which would, of course, be unknown in actual
experimentation, are also given as well as two theoretical series of
values of the means, one biased due to the error term −(e)², the
mean of (e)² being unity, and the other with this term eliminated. The
sample average and variance for x = −6 goes far astray because of the
abnormally high entry at experiment X due in turn to the fact that in
this case the x-error was 3.0. One would probably decide to eliminate
or to adjust this value by reference to the other nine values in the
column: though this is always a hazardous procedure if bias is to be
avoided. If this discrepant value were eliminated, the resulting (sample)
mean would be −44.22 and the variance 177.65. For the following
computations the original values have been retained.

The first fact to note is the great variability of the variance from
grade to grade, which would make the assumption of equal variance,
necessary for the application of the Fisher sampling theory, quite
unreal. The value of d, the estimate of $\delta$, the coefficient of $u^3$, is, using
(11), 0.0127. As the standard deviation, estimated from (15), is 0.0292,
we infer that $\delta$ is not significantly different from zero. The estimates
of $\gamma$ and $\beta$ ± their standard deviations estimated from (14) and (16)
are


(20)    c = −1.03 ± 0.12
        b = 2.44 ± 0.49


so that the sample values do not differ significantly from the true values
−1 and +2 respectively.


TABLE 2
CONSTRUCTED EXAMPLE: INHERENT RELATION QUADRATIC

[Table: observed values of y for each of the five values of x in each of the ten experiments, followed by the means (experiments, biased; theoretical, biased and unbiased) and the variances (experiments, 9 d.f.; theoretical). The tabulated figures are not recoverable.]
As to the estimate of $\alpha$, it is first necessary to estimate the variance
$\epsilon_2$ of the error of observation e of x. Using (13), with the sample
values for b and c, the four estimates of $\epsilon_2$ (i.e. $e_{22}$, $e_{23}$, $e_{24}$, and $e_{25}$) are
respectively 2.48, 1.10, 0.81, and 0.80, the mean of which is 1.3, which
is taken as the estimate of $\epsilon_2$: the actual value, as we know, is 1. Using
(9) and (10)

(21)    Mean y = a + c($e_2$ + $\mu_2$)

or      −15.77 = a − 1.03(1.3 + 18)

giving a = 4.11, the actual value being 3. The variance estimate,
computed from (17), using sample values, is 1.91, so that the standard
deviation is 1.4. The error of estimate of a, namely 1.11, is accordingly less
than the estimated standard deviation.
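The estimation steps for the quadratic case can be sketched as a simulation. This is an illustration under stated assumptions, not a reproduction of the paper's computation: the errors are drawn afresh from a normal generator rather than taken from Pearson's table, the number of replications is made very large so that consistency is visible, and all function names are hypothetical. The steps are c from (11), b from (9) with the cubic coefficient taken as zero, the estimate of the x-error variance from (13), and a from (21).

```python
import random

X_GRADES = [-6.0, -3.0, 0.0, 3.0, 6.0]


def simulate(reps, rng):
    """Observed y in each x-grade for v = 3 + 2u - u^2, with u = x - e."""
    ys = [[] for _ in X_GRADES]
    for _ in range(reps):
        for i, x in enumerate(X_GRADES):
            u = x - rng.gauss(0.0, 1.0)                 # error e in x
            ys[i].append(3.0 + 2.0 * u - u * u + rng.gauss(0.0, 1.0))
    return ys


def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values) / (len(values) - 1)


def estimate(ys):
    m = len(X_GRADES)
    mu2 = sum(x ** 2 for x in X_GRADES) / m
    mu4 = sum(x ** 4 for x in X_GRADES) / m
    ybar = [mean(col) for col in ys]
    var_y = [sample_variance(col) for col in ys]
    nu0 = sum(ybar) / m
    nu1 = sum(x * y for x, y in zip(X_GRADES, ybar)) / m
    nu2 = sum(x * x * y for x, y in zip(X_GRADES, ybar)) / m
    c = (nu2 - mu2 * nu0) / (mu4 - mu2 ** 2)   # relation (11)
    b = nu1 / mu2                              # relation (9), delta = 0
    e2_parts = []
    for i in range(1, m):                      # relation (13)
        denom = (b + 2 * c * X_GRADES[i - 1]) ** 2 - (b + 2 * c * X_GRADES[i]) ** 2
        e2_parts.append((var_y[i - 1] - var_y[i]) / denom)
    e2 = mean(e2_parts)
    a = nu0 - c * (e2 + mu2)                   # relation (21)
    return a, b, c, e2


a_hat, b_hat, c_hat, e2_hat = estimate(simulate(20000, random.Random(11)))
print(a_hat, b_hat, c_hat, e2_hat)
```

With only ten replications, as in the constructed example, the same steps give much rougher figures, which is the point of the standard deviations quoted in the text.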
COMBINATION OF NEIGHBORING CELLS
IN CONTINGENCY TABLES


C. C. Craig
University of Michigan


In the application of the $\chi^2$-test for independence to
contingency tables in case the expected frequencies in some of the
cells are small, a commonly recommended procedure is to
coalesce two or more rows and/or columns. Based on Cramér's
modified $\chi^2$ minimum method, this paper develops means of
combining individual cells instead of whole rows or columns.
Specific results are given for a pair of cells in the same row (or
column), a block of cells, two pairs which form a 2×2 block,
and two row pairs whose columns do not overlap. Results
generally are simple if the sets of cells combined form a block
or blocks which do not overlap. The complications that arise in
the case of partial overlapping are illustrated for two row pairs
which have a single column in common. The estimates of the
probability of an item falling in each of the three columns on
the null hypothesis are obtained from a quadratic equation
which, however, always leads to a unique set of consistent
estimates.


ONE of the perennial difficulties encountered by persons who try to
follow the directions given in the numerous books on statistical
methods in their own research is how to proceed with contingency
tables larger than 2×2 in which the expected frequencies in some of the
cells are below the deadline of 5 or 10 (depending on the book).
Frequently a worker is very reluctant to make the frequently recom-


frequencies. But it may be useful to mention that the exact test
available for the 2×2 table in such cases can be extended to larger tables if


$\chi^2$ in the obvious mechanical way, reducing the degrees of freedom by
one. It is no longer clear that the $\chi^2$ obtained in this way has
(asymptotically) a $\chi^2$-distribution with the degrees of freedom found in this
way. This set me to thinking about the possibility of a correct
procedure of this kind developed from an acceptable mathematical model.

The results which follow were derived by what Cramér has called
the modified $\chi^2$ minimum method. For full details and in particular
for a proof that a $\chi^2$ obtained in this way has asymptotically a $\chi^2$
distribution, the reader is referred to Cramér's book [4]. Briefly, for an
r×s contingency table, under the null hypothesis of no association or
of independence, the probability $p_{ij}$ that a randomly chosen individual
belongs simultaneously in the ith row and jth column of the table
(i = 1, 2, ..., r; j = 1, 2, ..., s) is the product of the probabilities
$p_{i\cdot}$ and $p_{\cdot j}$ that an individual falls into the ith row and the jth column
respectively. Since


        $\sum_{i=1}^{r} p_{i\cdot} = \sum_{j=1}^{s} p_{\cdot j} = 1,$


there are r+s−2 unknown parameters to be estimated from the data
before the expected cell frequencies can be found from which a $\chi^2$ can
be computed. As Cramér remarks the modified $\chi^2$ minimum method is
equivalent to the use of the principle of maximum likelihood for the
estimation of the $p_{i\cdot}$ and $p_{\cdot j}$. If the observed frequency in the cell at
the intersection of the ith row and the jth column is $\nu_{ij}$, then the
probability of occurrence of the observed cell frequencies on the null
hypothesis is


(1)     $C\,(p_{1\cdot}p_{\cdot 1})^{\nu_{11}}(p_{1\cdot}p_{\cdot 2})^{\nu_{12}}\cdots(p_{r\cdot}p_{\cdot s})^{\nu_{rs}} = C\,p_{1\cdot}^{\nu_{1\cdot}}\cdots p_{r\cdot}^{\nu_{r\cdot}}\,p_{\cdot 1}^{\nu_{\cdot 1}}\cdots p_{\cdot s}^{\nu_{\cdot s}}$

in which

        $\nu_{i\cdot} = \sum_{j=1}^{s}\nu_{ij}, \qquad \nu_{\cdot j} = \sum_{i=1}^{r}\nu_{ij},$

        $\sum_{i=1}^{r}\nu_{i\cdot} = \sum_{j=1}^{s}\nu_{\cdot j} = n,$


and C depends on the $\nu_{ij}$'s but not on the $p_{i\cdot}$'s nor on the $p_{\cdot j}$'s. In this
case the method of maximum likelihood consists of determining the
values $\hat{p}_{i\cdot}$ and $\hat{p}_{\cdot j}$ of $p_{i\cdot}$ and $p_{\cdot j}$ respectively for which the expression
(1) is maximized for fixed values of the marginal totals, $\nu_{i\cdot}$ and $\nu_{\cdot j}$, and
then using those values as the sought estimates of the $p_{i\cdot}$ and $p_{\cdot j}$. The
equations which have the $\hat{p}_{i\cdot}$ and $\hat{p}_{\cdot j}$ as their solutions are

        $\sum_{j=1}^{s}\left(\frac{\nu_{ij}}{p_{i\cdot}} - \frac{\nu_{rj}}{p_{r\cdot}}\right) = \frac{\nu_{i\cdot}}{p_{i\cdot}} - \frac{\nu_{r\cdot}}{p_{r\cdot}} = 0 \qquad (i = 1, \ldots, r-1)$
(2)
        $\sum_{i=1}^{r}\left(\frac{\nu_{ij}}{p_{\cdot j}} - \frac{\nu_{is}}{p_{\cdot s}}\right) = \frac{\nu_{\cdot j}}{p_{\cdot j}} - \frac{\nu_{\cdot s}}{p_{\cdot s}} = 0 \qquad (j = 1, \ldots, s-1)$

which give the usual estimates

        $\hat{p}_{i\cdot} = \frac{\nu_{i\cdot}}{n}, \qquad \hat{p}_{\cdot j} = \frac{\nu_{\cdot j}}{n}.$

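To apply this method to the case in which a pair of neighboring cells
have been combined, we assume that s > 2 and that the cells combined
are the first two in the first row. That is, $\nu_{11} + \nu_{12} = w$ is known but not
$\nu_{11}$ and $\nu_{12}$ separately. The null hypothesis to be tested becomes:
$E(\nu_{ij}) = n p_{i\cdot} p_{\cdot j}$ for every cell other than the two combined, and
$E(w) = n p_{1\cdot}(p_{\cdot 1} + p_{\cdot 2})$.
Now the modified $\chi^2$ minimum or maximum likelihood equations
corresponding to (2) become for rows


        $\frac{\nu_{i\cdot}}{p_{i\cdot}} - \frac{\nu_{r\cdot}}{p_{r\cdot}} = 0 \qquad (i = 1, \ldots, r-1),$

as before, while for columns they are

        $\frac{w}{p_{\cdot 1} + p_{\cdot 2}} + \frac{\nu_{\cdot 1} - \nu_{11}}{p_{\cdot 1}} - \frac{\nu_{\cdot s}}{p_{\cdot s}} = 0,$

(3)     $\frac{w}{p_{\cdot 1} + p_{\cdot 2}} + \frac{\nu_{\cdot 2} - \nu_{12}}{p_{\cdot 2}} - \frac{\nu_{\cdot s}}{p_{\cdot s}} = 0,$

        $\frac{\nu_{\cdot j}}{p_{\cdot j}} - \frac{\nu_{\cdot s}}{p_{\cdot s}} = 0 \qquad (j = 3, \ldots, s-1).$


The row system solves as before to give


        $\hat{p}_{i\cdot} = \frac{\nu_{i\cdot}}{n} \qquad (i = 1, \ldots, r).$


To solve the column system we first multiply the jth equation by
$p_{\cdot j}$, j = 1, ..., s−1, and then add, getting


        $\frac{\nu_{\cdot s}}{p_{\cdot s}} = n,$


so that

        $\hat{p}_{\cdot j} = \frac{\nu_{\cdot j}}{n} \qquad (j = 3, \ldots, s).$


Finally, the first two equations of the column system are solved
separately, giving

        $\hat{p}_{\cdot 1} = \frac{(\nu_{\cdot 1} + \nu_{\cdot 2})(\nu_{\cdot 1} - \nu_{11})}{n(\nu_{\cdot 1} + \nu_{\cdot 2} - w)},$
(4)
        $\hat{p}_{\cdot 2} = \frac{(\nu_{\cdot 1} + \nu_{\cdot 2})(\nu_{\cdot 2} - \nu_{12})}{n(\nu_{\cdot 1} + \nu_{\cdot 2} - w)}.$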


With these estimates of the $p_{i\cdot}$ and the $p_{\cdot j}$, the expected cell
frequencies and $\chi^2$ are calculated in the usual way. The resulting $\chi^2$ has
(r−1)(s−1)−1 degrees of freedom. The reader may have noted that
the expression for $\hat{p}_{\cdot 1}$ is the product of the proportion of the total
frequency falling into the first two columns by the proportion of items
in the first column to the total in the first two columns after the two
cells that were combined have been deleted. It is obvious that the
method here developed for the first two cells in the first row applies
equally well to any two cells lying in the same row or in the same
column; if it seems simpler, one may first interchange rows and columns
in the table until the cells to be coalesced lie in the position for which
the formulas have been given.
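The procedure for one combined pair can be sketched as follows. The table of counts is hypothetical and the function name is illustrative; the two cells in the first row, first two columns, are treated as known only through their sum w, so the column estimates use only the remaining rows of those columns, as in (4).

```python
def chi_square_with_combined_pair(table):
    """Chi-square when cells (1,1) and (1,2) are known only by their sum w.

    `table` holds the full counts for convenience, but only w and the
    other cells are used, mirroring formula (4).
    """
    r, s = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    w = table[0][0] + table[0][1]
    p_row = [sum(row) / n for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(s)]
    p_col = [t / n for t in col_tot]
    pooled = col_tot[0] + col_tot[1]
    # Column counts with the combined cells deleted (rows 2..r only).
    rest1 = sum(table[i][0] for i in range(1, r))
    rest2 = sum(table[i][1] for i in range(1, r))
    p_col[0] = pooled * rest1 / (n * (pooled - w))
    p_col[1] = pooled * rest2 / (n * (pooled - w))
    expected_w = n * p_row[0] * (p_col[0] + p_col[1])
    chi2 = (w - expected_w) ** 2 / expected_w
    total_expected = expected_w
    for i in range(r):
        for j in range(s):
            if i == 0 and j in (0, 1):
                continue  # already counted through w
            e = n * p_row[i] * p_col[j]
            total_expected += e
            chi2 += (table[i][j] - e) ** 2 / e
    df = (r - 1) * (s - 1) - 1
    return p_col, chi2, df, total_expected, n


# Hypothetical 3 x 4 table of counts.
counts = [[3, 5, 20, 22],
          [30, 25, 28, 27],
          [17, 20, 32, 21]]
p_col, chi2, df, total_expected, n = chi_square_with_combined_pair(counts)
print(p_col, chi2, df)
```

The estimated column probabilities still sum to one and the expected frequencies still sum to n, which gives a quick check on the arithmetic.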

As an example, consider the following 4×3 table


constructed by combining the first two cells in the third row in an
example used in Hoel’s textbook [5] for which both expected frequencies
are less than 10 on the null hypothesis of independence. We have

        $\hat{p}_{1\cdot} = \frac{232}{400}, \qquad \hat{p}_{2\cdot} = \frac{116}{400}, \qquad \hat{p}_{3\cdot} = \frac{52}{400},$

and

        $\hat{p}_{\cdot 3} = \frac{111}{400}, \qquad \hat{p}_{\cdot 4} = \frac{176}{400}$

as usual. But now,

        $\hat{p}_{\cdot 1} = 0.1075, \qquad \hat{p}_{\cdot 2} = 0.1750.$


Then
        $E(\nu_{11}) = 400\,\hat{p}_{1\cdot}\hat{p}_{\cdot 1} = 24.9,$
etc., for the single cells while
        $E(w) = 400\,\hat{p}_{3\cdot}(\hat{p}_{\cdot 1} + \hat{p}_{\cdot 2}) = 14.7.$


Then from the 11 cells, $\chi^2$ = 17.9 which for 3·2 − 1 = 5 degrees of
freedom lies between the 1 per cent and 0.1 per cent points.

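This method is readily extended to the case in which the coalesced
cells form a rectangular block. For example, suppose that only the
total $w = \nu_{11} + \nu_{12} + \nu_{21} + \nu_{22}$ of the 2×2 square in the upper left corner of
the r×s table (r ≥ 3, s ≥ 3) is given. One now has maximum likelihood
equations of the form (3) for both rows and columns but the solution
proceeds as before, giving


        $\hat{p}_{1\cdot} = \frac{(\nu_{1\cdot} + \nu_{2\cdot})(\nu_{1\cdot} - \nu_{11} - \nu_{12})}{n(\nu_{1\cdot} + \nu_{2\cdot} - w)},$

        $\hat{p}_{2\cdot} = \frac{(\nu_{1\cdot} + \nu_{2\cdot})(\nu_{2\cdot} - \nu_{21} - \nu_{22})}{n(\nu_{1\cdot} + \nu_{2\cdot} - w)},$

        $\hat{p}_{\cdot 1} = \frac{(\nu_{\cdot 1} + \nu_{\cdot 2})(\nu_{\cdot 1} - \nu_{11} - \nu_{21})}{n(\nu_{\cdot 1} + \nu_{\cdot 2} - w)},$
(5)
        $\hat{p}_{\cdot 2} = \frac{(\nu_{\cdot 1} + \nu_{\cdot 2})(\nu_{\cdot 2} - \nu_{12} - \nu_{22})}{n(\nu_{\cdot 1} + \nu_{\cdot 2} - w)},$

        $\hat{p}_{i\cdot} = \frac{\nu_{i\cdot}}{n} \qquad (i = 3, \ldots, r),$

        $\hat{p}_{\cdot j} = \frac{\nu_{\cdot j}}{n} \qquad (j = 3, \ldots, s).$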

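Generally if the first $r_1$ rows ($r_1 < r$) and the first $s_1$ columns ($s_1 < s$)
are combined with a total of w, we have


        $\hat{p}_{i\cdot} = \frac{\sum_{k=1}^{r_1}\nu_{k\cdot}\,\sum_{j=s_1+1}^{s}\nu_{ij}}{n\sum_{k=1}^{r_1}\sum_{j=s_1+1}^{s}\nu_{kj}} \qquad (i = 1, \ldots, r_1),$

(6)     $\hat{p}_{\cdot j} = \frac{\sum_{l=1}^{s_1}\nu_{\cdot l}\,\sum_{i=r_1+1}^{r}\nu_{ij}}{n\sum_{i=r_1+1}^{r}\sum_{l=1}^{s_1}\nu_{il}} \qquad (j = 1, \ldots, s_1),$

        $\hat{p}_{i\cdot} = \frac{\nu_{i\cdot}}{n} \qquad (i = r_1 + 1, \ldots, r),$

        $\hat{p}_{\cdot j} = \frac{\nu_{\cdot j}}{n} \qquad (j = s_1 + 1, \ldots, s).$


The resulting $\chi^2$ has $(r-1)(s-1) - r_1 s_1 + 1$ degrees of freedom.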

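There is still no difficulty if two pairs of cells, each pair belonging to
the same row and both pairs belonging to the same two columns, are
combined. Suppose the totals $v = \nu_{11} + \nu_{12}$ and $w = \nu_{21} + \nu_{22}$ are known as
well as the separate entries in the remaining cells. We now get


        $\hat{p}_{\cdot 1} = \frac{(\nu_{\cdot 1} + \nu_{\cdot 2})(\nu_{\cdot 1} - \nu_{11} - \nu_{21})}{n(\nu_{\cdot 1} + \nu_{\cdot 2} - v - w)},$

(7)     $\hat{p}_{\cdot 2} = \frac{(\nu_{\cdot 1} + \nu_{\cdot 2})(\nu_{\cdot 2} - \nu_{12} - \nu_{22})}{n(\nu_{\cdot 1} + \nu_{\cdot 2} - v - w)},$

        $\hat{p}_{\cdot j} = \frac{\nu_{\cdot j}}{n} \qquad (j = 3, \ldots, s),$

        $\hat{p}_{i\cdot} = \frac{\nu_{i\cdot}}{n} \qquad (i = 1, \ldots, r).$


The resulting $\chi^2$ has (r−1)(s−1)−2 degrees of freedom.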

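Likewise things work out simply if in each of two different rows
a pair of neighboring cells are combined, the two pairs having no
column in common. Let $v = \nu_{11} + \nu_{12}$ and $w = \nu_{23} + \nu_{24}$. We find

        $\hat{p}_{i\cdot} = \frac{\nu_{i\cdot}}{n} \qquad (i = 1, \ldots, r),$

        $\hat{p}_{\cdot 1} = \frac{(\nu_{\cdot 1} + \nu_{\cdot 2})(\nu_{\cdot 1} - \nu_{11})}{n(\nu_{\cdot 1} + \nu_{\cdot 2} - v)},$

        $\hat{p}_{\cdot 2} = \frac{(\nu_{\cdot 1} + \nu_{\cdot 2})(\nu_{\cdot 2} - \nu_{12})}{n(\nu_{\cdot 1} + \nu_{\cdot 2} - v)},$
(8)
        $\hat{p}_{\cdot 3} = \frac{(\nu_{\cdot 3} + \nu_{\cdot 4})(\nu_{\cdot 3} - \nu_{23})}{n(\nu_{\cdot 3} + \nu_{\cdot 4} - w)},$

        $\hat{p}_{\cdot 4} = \frac{(\nu_{\cdot 3} + \nu_{\cdot 4})(\nu_{\cdot 4} - \nu_{24})}{n(\nu_{\cdot 3} + \nu_{\cdot 4} - w)},$

        $\hat{p}_{\cdot j} = \frac{\nu_{\cdot j}}{n} \qquad (j = 5, \ldots, s).$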

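There is only a slight modification which by now the reader can
make for himself in case the two coalesced pairs of adjacent blocks lie
in the same row. If $v = \nu_{11} + \nu_{12}$ and $w = \nu_{13} + \nu_{14}$, we have


        $\hat{p}_{i\cdot} = \frac{\nu_{i\cdot}}{n} \qquad (i = 1, \ldots, r),$

        $\hat{p}_{\cdot 1} = \frac{(\nu_{\cdot 1} + \nu_{\cdot 2})(\nu_{\cdot 1} - \nu_{11})}{n(\nu_{\cdot 1} + \nu_{\cdot 2} - v)},$

        $\hat{p}_{\cdot 2} = \frac{(\nu_{\cdot 1} + \nu_{\cdot 2})(\nu_{\cdot 2} - \nu_{12})}{n(\nu_{\cdot 1} + \nu_{\cdot 2} - v)},$
(9)
        $\hat{p}_{\cdot 3} = \frac{(\nu_{\cdot 3} + \nu_{\cdot 4})(\nu_{\cdot 3} - \nu_{13})}{n(\nu_{\cdot 3} + \nu_{\cdot 4} - w)},$

        $\hat{p}_{\cdot 4} = \frac{(\nu_{\cdot 3} + \nu_{\cdot 4})(\nu_{\cdot 4} - \nu_{14})}{n(\nu_{\cdot 3} + \nu_{\cdot 4} - w)},$

        $\hat{p}_{\cdot j} = \frac{\nu_{\cdot j}}{n} \qquad (j = 5, \ldots, s).$


The degrees of freedom for $\chi^2$ in both the last two cases is
(r−1)(s−1)−2.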

There is no difficulty so long as the pairs or rectangular blocks of 
neighboring cells have all of their rows or columns in common or else 
have no rows or columns in common. Аз soon аз there is partial over- 
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3 lapping of rows or columns, the likelihood equations are not во simply | 
| solved. Consider the simplest case of this kind in which = у-у and 

1 —»--»». The likelihood (or modified x?-minimum) equations for 
rows are as simple as ever, giving 


But the equations for columns are 


|] v Pat» Vig 
| БАА 1 ur i 
Pat 22 Pa Pes 
| 2% + ш TA NM вен 
| Dac pas Dad pa 2.2 [m 
ш 7.3 — Veg Vig 
i ПА ИЕ Tr TRO CEDARE 
(10) рз + Da ра Ds 
7.4 Ves 
puch и D: 
De D^ 
e . 
дао ce, 
Pet) 7. 


On multiplying by corresponding р./в and adding, we still get 


v. . 
pu (7-4-::,9). 
т 
To solve the remaining three equations set

$$a = \frac{p_{.1}}{p_{.1}+p_{.2}+p_{.3}}, \qquad b = \frac{p_{.2}}{p_{.1}+p_{.2}+p_{.3}}, \qquad c = 1-a-b;$$

$$\nu_{.1}+\nu_{.2}+\nu_{.3} = m; \quad\text{and}\quad
\nu_{.1}-\nu_{11} = A_1, \quad \nu_{.2}-\nu_{12}-\nu_{22} = A_2, \quad \nu_{.3}-\nu_{23} = A_3.$$

Then on eliminating $a$ from the first and third equations by means of

$$a = \frac{A_1(1-c)}{m(1-c)-v} \tag{11}$$

we have the quadratic for $c$:

$$m(m-w-A_1)c^2 - [(m-v)(m-w) - m(A_1-A_3) - A_1A_3]\,c + (m-v-A_1)A_3 = 0. \tag{12}$$

It can be shown by some fairly messy but elementary algebra that the
two roots of this equation are both real and are positive proper
fractions. Further, the larger root cannot be used since then, using (11), we
get $a+c>1$. Thus, taking the smaller root of (12), we get a unique set
of values for $a$, $b$, and $c$ which uniquely determine $p_{.1}$, $p_{.2}$, and
$p_{.3}$. The degrees of freedom for $\chi^2$ are still $(r-1)(s-1)-2$.
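The solution by way of (11) and (12) can be sketched numerically. The following is only an illustration: the function name `solve_overlap` and the counts in the example are hypothetical, and $m$ is taken as $A_1+A_2+A_3+v+w$, which follows from the definitions above.

```python
import math

def solve_overlap(A1, A2, A3, v, w):
    """Return (a, b, c) from the smaller root of quadratic (12) and relation (11)."""
    m = A1 + A2 + A3 + v + w                  # m = nu.1 + nu.2 + nu.3
    qa = m * (m - w - A1)                     # coefficient of c^2 in (12)
    qb = -((m - v) * (m - w) - m * (A1 - A3) - A1 * A3)   # coefficient of c
    qc = (m - v - A1) * A3                    # constant term
    disc = qb * qb - 4 * qa * qc
    c = (-qb - math.sqrt(disc)) / (2 * qa)    # smaller root (qa > 0)
    a = A1 * (1 - c) / (m * (1 - c) - v)      # relation (11)
    b = 1 - a - c
    return a, b, c

# Hypothetical counts, chosen only for illustration.
a, b, c = solve_overlap(10, 20, 15, 8, 7)
print(a, b, c)
```

The three proportions returned satisfy the column equations (10) after division by $p_{.1}+p_{.2}+p_{.3}=m/n$, which is a convenient check on any hand computation.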
APPROXIMATING THE MODE FROM WEIGHTED
SAMPLE VALUES


Howard L. Jones
Illinois Bell Telephone Company


The weighted mean of ordered sample observations can be
used to approximate the mode under favorable conditions,
where the weights are determined from the first two terms in a
Taylor expansion of the maximum likelihood estimate. Such
weights are shown in Table 1 for the case where the sample is
selected from a t-distribution with known kurtosis.

INTRODUCTION 


One of the oldest and most difficult statistical problems is that of
deciding how much weight to assign to extreme sample observations
in trying to estimate the typical or average measurement for
some population. It may be regarded as a part of the more general
problem of how to make estimates on the basis of observed sample
values after such values have been arranged in order of size.

Sample values thus arranged are now commonly called order statistics;
and estimates computed therefrom, as well as other functions of
these statistics, have been termed systematic statistics [4]. Weighted
sums of sample values arranged in order of size may therefore be called
linear systematic statistics [3]. Examples are the sample median, where
a weight of zero is assigned to each order statistic except for one or two
central values (depending on whether the sample size is odd or even),
and the midrange, where the first and last order statistics are each weighted
one-half and every other order statistic is weighted zero. The sample
mean may also be regarded as a systematic statistic where each order
statistic is assigned a weight equal to the reciprocal of the sample size;
although in this case, the equality of the weights makes it immaterial
whether the sample values are ordered or not.

The advantages of linear systematic statistics have been discussed
by various writers. It is known [1, p. 483] that the sample mean is the
best unbiased regular estimate of the mean of a normal distribution. It
also tends to have desirable asymptotic properties in large samples
from most types of distributions that statisticians have studied. But
even where other types of functions are more efficient in the sense that
for a given sample size they lead to unbiased estimates with a smaller
variance, a linear systematic statistic may have important advantages
in practical problems. Chief among these are the simplicity of the
computations, and the economy of time and expense in situations where
samples of any desired size can be selected and measured with relative
ease [2]. These advantages explain why the range (a systematic
statistic in which the extreme order statistics are weighted -1 and +1),
instead of the root-mean-square, is generally used to estimate the
standard deviation of a distribution in quality control applications.
Some work has been done on problems of measuring and maximizing
the efficiency of various linear statistics, particularly for samples from
a normal distribution [5].

A deterrent to the use of these statistics is the difficulty of finding
satisfactory weights. The general problem was discussed several years
ago in an unpublished paper, "The Weighted Mean of Random
Observations Arranged in Order of Size," which was written jointly with
John H. Smith and recently reproduced in Ditto form as National
Bureau of Standards Working Paper SEL-51-1, February 1951. The
weights which lead to the best unbiased linear estimates of the mean,
mode, median, or any other location parameter can usually be
determined when the first and second moments and the product moments
of all the order statistics are known. Since these moments are difficult
to evaluate for many typical distributions, a general method was given
for finding weights that would lead to a first approximation of the mode
under certain favorable conditions. The present article is the result of
Churchill Eisenhart's desire to have this method published more
widely, since it has been found to be useful in broader applications.

In order to illustrate an application of the method, the weights
shown in Table 1 were computed for various sized samples selected
from a t-distribution with varying degrees of kurtosis. Hence, if one
knows that the parent distribution is approximately of the same shape
as some form of the t-distribution, and if the approximate kurtosis of
this distribution is also known, the mode can be estimated from a
sample of size 10 or less by arranging the sample values in order of size and
then applying the weights shown in the table. If the kurtosis is 3,
indicating a normal distribution, the weights are all equal, as they should
be. If the kurtosis is large, the median is weighted more heavily than
the extreme observations.

While the mode of a distribution is generally of less interest than the
mean or even the median, there are many distributions like the
t-distributions where these location parameters coincide. The estimated
location of the mode may also be useful in other situations where its
relationship to other parameters like the mean or median is known. The
mode is emphasized here merely because it is important in the
proposed method for developing the weights.


TABLE 1
WEIGHTS FOR APPROXIMATING THE MODE OF A
t-DISTRIBUTION

Size of   Rank in   Kurtosis:* $\alpha_4 = \mu_4/\mu_2^2$
sample    sample    3     3.5    4     4.5    5     6     9     $\infty$

3 2 .34 .86 .36 .88 .88 .40 .40 .42 


5 3 20 22 .22 24 .24 24 26 28 
2,4 20 21 .22 22 .22 23 23 24 
1,5 20 18 ‚17 16 .16 15 14 12, 
8,4 17 18 19 .20 .20 21 22 28 

6 2,5 17 17 18 18 18 18 18 18 
1,6 16 15 13 12 12 il 10 09 


4,5 13 14 . .15 15  .16 16 17 18 
8 3,6 13 14 4 15 15 15 15 16 
2,7 12 із 0! 19 530891 2 12 12 11 
1,8 12 10 5.09.55: ОВО 07 06 05 

• 
5 12 12 мае 14 16 16 
9 4, 6 11.4..12.— 028 Я 14 15 16 
3,7 11 12 алудан 18 18 18. 
2,8 11 і 10  .10 10 10 09 
1,9 1 09 .07 07  .06  .06 , .04 04 
5,6 10 11-742 7,12 13 13 14 15 
47222510111 УАЗ АЙН 7522 13 13 14 
10:5073:8 7730.7. 1:30 ae HIR LGB ET LB e 
2,9- 10-....3014 209 25) OB) OD 8 1081 7:085. „0 
1,10. .10 '/.08 206 06: .05 . .05. ..04. 7,02 


* The column headings are the values of $\alpha_4$ for the parent distribution, not for the sample.

DERIVATION OF THE WEIGHTS


For some infinite population, assume that the distribution of a
measurement $x$ is described by the probability density function

$$y = f(x-\theta_1, \theta_2, \theta_3, \ldots, \theta_k), \tag{1}$$

where $\theta_1, \theta_2, \ldots, \theta_k$ are parameters of the distribution, the form of the
function being such that $\theta_1$ is a location parameter. Assume, for every
admissible value of $x$ where $y>0$, that the derivative $y'=dy/dx$ exists,
that $y$ has a unique mode at $x=\theta_1$, and that $y'/y$ is analytic. Let $y_i$ and
$y_i'$, respectively, denote the value of $y$ and its first derivative at the
point $x=x_i$; and let

$$\phi(x_i) = -y_i'/y_i, \tag{2}$$

$$\phi'(x_i) = d\phi(x_i)/dx_i. \tag{3}$$

Assume that for every pair $x_i$ and $x_j$, the ratio

$$R_{ij} = \phi'(x_i)/\phi'(x_j) \tag{4}$$

for $\theta_1=0$ is known; that is, the value of $R_{ij}$ does not involve an unknown
parameter. Then weights for approximating the mode can be developed
as follows.

Let

$$y_r = f(x_r-\theta_1, \theta_2, \ldots, \theta_k), \tag{5}$$

where $x_r$ is the $r$th order statistic in a random sample of size $n$ from
distribution (1). The maximum likelihood estimate of $\theta_1$ when
$\theta_2, \theta_3, \ldots, \theta_k$ are known is, in general, a solution of the equation

$$\sum \frac{1}{y_r}\,\frac{\partial y_r}{\partial \theta_1} = 0, \tag{6}$$

where $\sum$ denotes summation over all $n$ values of $r$. From the
juxtapositional relationship between $x_r$ and $\theta_1$ in (5),

$$\frac{1}{y_r}\,\frac{\partial}{\partial\theta_1} f(x_r-\theta_1, \theta_2, \ldots, \theta_k)
= -\frac{1}{y_r}\,\frac{\partial}{\partial x_r} f(x_r-\theta_1, \theta_2, \ldots, \theta_k)
= \phi(x_r), \tag{7}$$

where $\phi(x_r)$ is defined by (2). Combining (6) and (7), we obtain

$$\sum \phi(x_r) = 0. \tag{8}$$


Let $\hat\theta_1$ be a value of $\theta_1$ that satisfies (8). Expanding $\phi(\hat\theta_1)$ in the
neighborhood of $x_r$ yields

$$\phi(\hat\theta_1) = \phi(x_r) + (\hat\theta_1-x_r)\phi'(x_r) + \tfrac{1}{2}(\hat\theta_1-x_r)^2\phi''(x_r) + \cdots, \tag{9}$$

where $\phi'(x_r), \phi''(x_r), \ldots$ denote successive derivatives with respect to
$x_r$. Neglecting all but the first two terms of the expansion (which
procedure may introduce considerable error when $\hat\theta_1-x_r$ is large), and
summing over the $n$ observations, we obtain the approximate
relationship,

$$n\phi(\hat\theta_1) \approx \sum \phi(x_r) + \sum (\hat\theta_1-x_r)\phi'(x_r), \tag{10}$$

or after combining with (8),

$$n\phi(\hat\theta_1) \approx \sum (\hat\theta_1-x_r)\phi'(x_r). \tag{11}$$

Now suppose that $\hat\theta_1=\theta_1$, the mode of distribution (1). (In general, this
supposition will be true only to a first approximation.) Then from (2),

$$\phi(\hat\theta_1) = 0. \tag{12}$$

Combining (11) and (12) leads to

$$\hat\theta_1 = \sum w_r x_r, \tag{13}$$

where

$$w_r = \phi'(x_r) \Big/ \sum \phi'(x_r). \tag{14}$$

For a normal distribution, $\phi'(x_r)=1/\sigma^2$ for every $x_r$; hence, $w_r=1/n$
for this case, which is as it should be. In general, however, $\phi'(x_r)$ varies
with $x_r$; and in order to make use of (14) to compute a linear estimate
of the mode, we must choose some value of $\phi'(x_r)$, say $\phi'(\tilde x_r)$, that is
typical for the order statistic $x_r$. In most applications, perhaps the
simplest choice is the one where $\tilde x_r$ is the mode of the distribution of $x_r$ in
a sample of size $n$ from a parent distribution of the specified form with
the mode chosen as the origin so that $\theta_1$ does not appear in the general
expression for $w_r$. This value of $\tilde x_r$ can usually be determined or
approximated when the form of the parent distribution is known.
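As a worked check of (2), (3), and (14), the normal case can be written out in full. This is only a sketch; $\sigma$ denotes the standard deviation of the normal parent and is the one symbol assumed beyond those defined above.

```latex
% Normal parent with location \theta_1 and scale \sigma:
%   y = (2\pi\sigma^2)^{-1/2} \exp\{-(x-\theta_1)^2 / (2\sigma^2)\}
\begin{align*}
  \frac{y'}{y} &= -\frac{x-\theta_1}{\sigma^{2}}, &
  \phi(x) &= -\frac{y'}{y} = \frac{x-\theta_1}{\sigma^{2}}, &
  \phi'(x) &= \frac{1}{\sigma^{2}}, \\
  w_r &= \frac{\phi'(x_r)}{\sum \phi'(x_r)}
       = \frac{1/\sigma^{2}}{n/\sigma^{2}} = \frac{1}{n}.
\end{align*}
```

Because $\phi'$ is constant here, the choice of typical values $\tilde x_r$ plays no part, and the weighted mean (13) reduces to the ordinary sample mean.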

Let $F(x)$ denote the cumulative probability distribution of $x$ (that
is, the probability that a random selection from distribution (1) will
be equal to or less than a particular value of $x$). Then the distribution
of the order statistic $x_r$ (where $x_r \le x_{r+1}$) may be written as

$$\frac{n!}{(r-1)!\,(n-r)!}\,[F(x_r)]^{r-1}\,[1-F(x_r)]^{n-r}\,y_r\,dx_r. \tag{15}$$

To find $\tilde x_r$, the mode of this distribution, we set the derivative equal to
zero. After dividing by

$$\frac{n!}{(r-1)!\,(n-r)!}\,[F(\tilde x_r)]^{r-1}\,[1-F(\tilde x_r)]^{n-r}\,y_r^2,$$

we obtain the equation

$$\frac{r-1}{F(\tilde x_r)} - \frac{n-r}{1-F(\tilde x_r)} + \frac{y_r'}{y_r^2} = 0. \tag{16}$$

This equation can generally be solved for $\tilde x_r$, for $r=1, 2, \ldots, n$, by an
iterative procedure.
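COMPUTING THE WEIGHTS FOR THE t-DISTRIBUTION


The probability density function for the t-distribution may be
written in the form

$$y = \frac{\Gamma\!\left(\frac{k+1}{2}\right)}{\sqrt{\pi k}\;\Gamma\!\left(\frac{k}{2}\right)}
\left[1+\frac{(t-\theta)^2}{k}\right]^{-(k+1)/2}, \tag{17}$$

where $k$ is the number of degrees of freedom. Since the mode is at
$t=\theta$, we set $\theta=0$ and obtain the functions

$$\frac{y'}{y} = -\frac{(k+1)t}{k+t^2}, \tag{18}$$

$$\phi(t) = \frac{(k+1)t}{k+t^2}, \tag{19}$$

$$\phi'(t) = \frac{(k+1)(k-t^2)}{(k+t^2)^2}. \tag{20}$$

From (14) and (20), the desired weights are seen to be proportional to
$(k-\tilde t_r^{\,2})/(k+\tilde t_r^{\,2})^2$, where $\tilde t_r$ is a typical value of $t_r$.

Let us first consider the case where $k=1$. After dividing by $\pi$,
equation (16) for this case becomes equal to

$$\frac{r-1}{\tfrac{1}{2}\pi + \arctan \tilde t_r} - \frac{n-r}{\tfrac{1}{2}\pi - \arctan \tilde t_r} - 2\tilde t_r = 0, \tag{21}$$
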

where the angles are measured in radians. For $n=5$, solving (21) for
$r=1, 2, \ldots, 5$ successively by an iterative procedure and employing
(20) and (14) to find the corresponding weights, we obtain the
following approximations:


 r      $\tilde t_r$      $\phi'(\tilde t_r)$     $w_r$
 1     -.8738      .1521      .03
 2     -.3689     1.3387      .27
 3      .0000     2.0000      .40
 4      .3689     1.3387      .27
 5      .8738      .1521      .03


The weights shown in the last column indicate that in a sample
consisting of five observations from this distribution, the sample median
should receive the greatest weight (.40) and the two extreme terms very
little weight (.03) in estimating the mode.
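The table above can be reproduced with a few lines of code. The sketch below solves (21) by bisection, which is only one possible choice of "iterative procedure" (the article does not say which was used), and then applies (20) with $k=1$ and (14). The function names are illustrative.

```python
import math

def cauchy_order_mode(r, n, lo=-50.0, hi=50.0, iters=200):
    """Solve equation (21) for the modal value of the r-th order statistic (k = 1)."""
    def g(t):
        a = math.atan(t)
        return (r - 1) / (math.pi / 2 + a) - (n - r) / (math.pi / 2 - a) - 2 * t
    # g is strictly decreasing in t, so plain bisection on a wide bracket suffices.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def cauchy_weights(n):
    """Modal values and weights from (20) with k = 1, normalized as in (14)."""
    modes = [cauchy_order_mode(r, n) for r in range(1, n + 1)]
    phi_prime = [2 * (1 - t * t) / (1 + t * t) ** 2 for t in modes]
    total = sum(phi_prime)
    return modes, [p / total for p in phi_prime]

modes, weights = cauchy_weights(5)
for r, (t, w) in enumerate(zip(modes, weights), start=1):
    print(r, round(t, 4), round(w, 2))
```

For larger samples from this distribution some modal values may exceed 1 in absolute value, in which case (20) gives negative values of $\phi'$; the sketch makes no attempt to handle that situation.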
Suppose we make the transformation

$$k = 2p, \qquad t^2 = 2p(1-x)/x, \tag{22}$$

for $t \ge 0$. Then for $x=\tilde x_r$, (20) reduces to

$$\phi'(\tilde t_r) = \frac{2p+1}{2p}\,\tilde x_r(2\tilde x_r-1). \tag{23}$$

The weights should therefore be proportional to $\tilde x_r(2\tilde x_r-1)$, where $\tilde x_r$
is the value of $x$ corresponding to the mode of the distribution of $t_r$.
Since $p$ and $x$ have the same meaning here as in Pearson's Tables of the
Incomplete Beta Function, we write

$$F(\tilde t_r) = 1 - \tfrac{1}{2} I_{\tilde x_r}(p, .5), \tag{24}$$

$$\frac{1}{y_r^2}\,\frac{dy_r}{dt_r} = -(2p+1)\,(1-\tilde x_r)^{1/2}\,\tilde x_r^{-p}\,B(p, .5), \tag{25}$$

where $I(p, .5)$ and $B(p, .5)$ also have the same meaning as in Pearson's
tables. Equation (16) can therefore be written as

$$\frac{r-1}{1-\tfrac{1}{2} I_{\tilde x_r}(p, .5)} - \frac{n-r}{\tfrac{1}{2} I_{\tilde x_r}(p, .5)}
- (2p+1)\,(1-\tilde x_r)^{1/2}\,\tilde x_r^{-p}\,B(p, .5) = 0, \tag{26}$$

and approximate solutions for $r=1, 2, \ldots, n$ can be found by
interpolating in those tables. These solutions can be taken as the values of
$\tilde x_r$ to be substituted in equation (23) in computing $\phi'(\tilde x_r)$ and $w_r$.
The procedure just described was used in arriving at the weights
shown in Table 1. The kurtosis shown in the column headings is
related to the number of degrees of freedom by the equation,

$$\alpha_4 = 3(k-2)/(k-4), \qquad \text{for } k \ge 4. \tag{27}$$
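Relation (27) can be inverted to find the degrees of freedom that correspond to each column heading of Table 1. The helper names below are illustrative; the inverse $k=(4\alpha_4-6)/(\alpha_4-3)$ is obtained by solving (27) for $k$, and the infinite-kurtosis column is taken to correspond to $k=4$.

```python
import math

def t_kurtosis(k):
    """Kurtosis alpha_4 of a t-distribution with k degrees of freedom, equation (27)."""
    if k < 4:
        raise ValueError("equation (27) applies only for k >= 4")
    if k == 4:
        return math.inf
    return 3.0 * (k - 2) / (k - 4)

def dof_from_kurtosis(alpha4):
    """Invert (27): k = (4*alpha4 - 6) / (alpha4 - 3); alpha4 = inf maps to k = 4."""
    if math.isinf(alpha4):
        return 4.0
    if alpha4 <= 3:
        raise ValueError("alpha4 = 3 is the normal limit (k -> infinity)")
    return (4.0 * alpha4 - 6.0) / (alpha4 - 3.0)

for alpha4 in (3.5, 4, 4.5, 5, 6, 9, math.inf):
    print(alpha4, dof_from_kurtosis(alpha4))
```

On this reading, the columns headed 3.5, 4, 4.5, 5, 6, and 9 correspond to 16, 10, 8, 7, 6, and 5 degrees of freedom respectively.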


COMPUTING THE WEIGHTS FOR A POLYNOMIAL DISTRIBUTION 


The properties of the estimates discussed in this article have not been
investigated for the general case. Investigation for cases where the
distribution is a polynomial in $x$ is relatively easy, however, provided the
sample size and the degree of the polynomial are small. The following
example shows what happens in a particular situation of this kind
where the distribution is skewed.

Consider the distribution of the simple form

$$z = 12u^2(1-u), \qquad 0 \le u \le 1, \tag{28}$$

which is a special case for the derivative of the Incomplete Beta
Function. We first note that the mode is at $u=\tfrac{2}{3}$. Distributions of this same
general form may therefore be obtained by making the transformation

$$u = \frac{x-\theta_1}{\theta_2} + \frac{2}{3}. \tag{29}$$

Since $du/dx = 1/\theta_2$, equation (28) becomes

$$y = \frac{12}{\theta_2}\left(\frac{x-\theta_1}{\theta_2}+\frac{2}{3}\right)^{2}\left(\frac{1}{3}-\frac{x-\theta_1}{\theta_2}\right), \tag{30}$$

with $x$ satisfying the inequality

$$\theta_1 - \frac{2\theta_2}{3} \le x \le \theta_1 + \frac{\theta_2}{3}. \tag{31}$$

This distribution has its mode at $x=\theta_1$, and is of the form prescribed
in discussing distribution (1).
After setting $\theta_1=0$ and $\theta_2=1$, we may proceed as previously suggested
and write down the relationships¹

$$y_r = \tfrac{4}{9}(1-3\tilde x_r)(2+3\tilde x_r)^2, \tag{32}$$

$$y_r' = -12\tilde x_r(2+3\tilde x_r), \tag{33}$$

$$\phi(\tilde x_r) = -\frac{y_r'}{y_r} = \frac{27\tilde x_r}{(1-3\tilde x_r)(2+3\tilde x_r)}, \tag{34}$$

$$\phi'(\tilde x_r) = \frac{27(2+9\tilde x_r^{\,2})}{(1-3\tilde x_r)^2(2+3\tilde x_r)^2}. \tag{35}$$

1 The computations can be carried out without assigning the arbitrary value of 1 to the scale
parameter, $\theta_2$. The effect will be to multiply the resulting values of $\phi'(\tilde x_r)$ by the reciprocal of $\theta_2^2$.

From (14), the desired weights are to be proportional to (35) for
$r=1, 2, \ldots, n$, where $\tilde x_r$ is a typical value of the $r$th order statistic.

Suppose the sample is of size 3. Employing (16), we may find the
modal values of $x_1$, $x_2$, and $x_3$ from the relationship

$$\frac{r-1}{F(\tilde x_r)} - \frac{3-r}{1-F(\tilde x_r)} - \frac{243\,\tilde x_r}{4(1-3\tilde x_r)^2(2+3\tilde x_r)^3} = 0, \tag{36}$$

where

$$-\frac{2}{3} \le \tilde x_r \le \frac{1}{3} \tag{37}$$

and

$$F(\tilde x_r) = \int_{-2/3}^{\tilde x_r} y_r\,d\tilde x_r = \frac{1}{27}(2-3\tilde x_r)(2+3\tilde x_r)^3. \tag{38}$$

We may also write

$$1-F(\tilde x_r) = \frac{1}{27}(1-3\tilde x_r)^2(11+18\tilde x_r+9\tilde x_r^{\,2}). \tag{39}$$

Putting $r=1$, 2, and 3 successively in equation (36) and clearing of
fractions, we obtain the equations

$$297\tilde x_1^{\,3} + 594\tilde x_1^{\,2} + 387\tilde x_1 + 64 = 0, \tag{40}$$

$$891\tilde x_2^{\,4} + 1188\tilde x_2^{\,3} - 27\tilde x_2^{\,2} - 582\tilde x_2 - 20 = 0, \tag{41}$$

$$99\tilde x_3^{\,2} - 66\tilde x_3 + 8 = 0. \tag{42}$$

The roots of these equations satisfying (37) have the approximate
values

$$\tilde x_1 = -.2481851535, \tag{43}$$

$$\tilde x_2 = -.0345011424, \tag{44}$$

$$\tilde x_3 = .1592556774. \tag{45}$$
Substituting these values for $\tilde x_r$ in (35), we obtain the approximations

$$\phi'(\tilde x_1) = 14.37744535, \tag{46}$$

$$\phi'(\tilde x_2) = 12.39545132, \tag{47}$$

$$\phi'(\tilde x_3) = 35.9319163. \tag{48}$$

Using these values in (14), we find the approximate values of the weights
to be as follows:

$$w_1 = .229287748, \tag{49}$$

$$w_2 = .197679424, \tag{50}$$

$$w_3 = .573032828. \tag{51}$$
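The chain from (40) through (51) can be retraced numerically. The sketch below finds the root of each polynomial inside the range (37) by bisection (an assumed choice of method), evaluates (35) at the three roots, and normalizes as in (14). All names are illustrative.

```python
def bisect(f, lo, hi, iters=200):
    """Root of f in [lo, hi] by bisection, assuming f changes sign on the interval."""
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Equations (40), (41), (42) for r = 1, 2, 3.
polys = [
    lambda x: 297 * x**3 + 594 * x**2 + 387 * x + 64,
    lambda x: 891 * x**4 + 1188 * x**3 - 27 * x**2 - 582 * x - 20,
    lambda x: 99 * x**2 - 66 * x + 8,
]

def phi_prime(x):
    """Equation (35) with theta_1 = 0 and theta_2 = 1."""
    return 27 * (2 + 9 * x * x) / ((1 - 3 * x) ** 2 * (2 + 3 * x) ** 2)

modes = [bisect(p, -2 / 3, 1 / 3) for p in polys]   # range (37)
values = [phi_prime(x) for x in modes]
weights = [v / sum(values) for v in values]          # equation (14)
print(modes)
print(values)
print(weights)
```

Each polynomial changes sign exactly once between $-2/3$ and $1/3$, so the bracket given by (37) is enough to isolate the admissible root.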


In using these weights, we are likely to be especially interested in
three properties of the resulting estimates of the mode, these
properties being defined by the following relationships:

$$\text{Bias} = E[\hat\theta_1] - \theta_1, \tag{52}$$

$$\text{Variance} = E[\hat\theta_1^{\,2}] - (E[\hat\theta_1])^2, \tag{53}$$

$$\text{Mean Square Error} = E[(\hat\theta_1-\theta_1)^2] = (\text{Bias})^2 + \text{Variance}, \tag{54}$$

where the symbol $E$ denotes average or expected value. For this
particular case where the distribution is described by equation (30), these
properties when the weights in (49), (50), and (51) are used to estimate
the mode are indicated in Table 2 in the column headed "'Maximum
likelihood' estimates." For purposes of comparison, the same
properties are shown for two other weight systems, the second system being
that which yields the "best unbiased estimates" (that is, the unbiased
linear estimates with minimum variance) and the third being the
system which yields the linear estimates with minimum mean square
error. The methods of deriving these weight systems and their
properties are outlined in the appendix that follows.

It should be kept in mind that the comparison shown in Table 2
applies only to distributions of the form described by equation (30), and
may not be indicative of the relative merits of different methods of
estimating the mode for other types of distribution.

is equal to one, depot uo Рид: үй Aeon Ie гип dep ыы
implied specification that $\theta_2$ is positive). Under other conditions, the weights shown in the last two
columns of Table 2 may not be optimum with respect to the properties indicated in the column headings.
For example, if $\theta_1$ is known to satisfy the inequality
$-.000000001 < \theta_1 < .000000001$
and $\theta_2$ is known to be large, say
$\theta_2 > 100{,}000$,
then the estimates obtained by employing the weights shown in the last column of the table would not
be as good as estimates obtained by setting all the weights equal to zero.

TABLE 2

COMPARISON OF THREE LINEAR SYSTEMATIC ESTIMATES
OF THE MODE OF A POLYNOMIAL DISTRIBUTION*

                      "Maximum            "Best"              Minimum mean
                      likelihood"         unbiased            square error
                      estimates           estimates           estimates
Weights
  $w_1$               .229287748          .232616040          .241695273
  $w_2$               .197679424          .133288817          .134304834
  $w_3$               .573032828          .634095143          .623999893
Bias                  $-.009202537\theta_2$     0                   $-.003276092\theta_2$
Variance              $.012464878\theta_2^2$    $.012478916\theta_2^2$    $.012454914\theta_2^2$
Mean square error     $.012549564\theta_2^2$    $.012478916\theta_2^2$    $.012465647\theta_2^2$

* The values shown are for the particular polynomial distribution described by equation (30). The
variance of this distribution is $.04\theta_2^2$.


APPENDIX 
Lower moments of the order statistics 


To investigate the properties of various linear systematic estimates
of the mode of the distribution described by equation (30), we first
derive the lower moments of the order statistics. Consider the
distribution described by equation (28). For a sample of size three from this
distribution, the moments and product moments of the order statistics
may be obtained from the general formula

$$E[u_1^{a}u_2^{b}u_3^{c}] = 3!\int_{u_3=0}^{1}\int_{u_2=0}^{u_3}\int_{u_1=0}^{u_2}
u_1^{a}u_2^{b}u_3^{c}\,y_1y_2y_3\,du_1\,du_2\,du_3, \tag{55}$$

where

$$y_i = 12u_i^2(1-u_i), \qquad i = 1, 2, 3. \tag{56}$$

It turns out that

$$E[u_1] = \frac{2127}{5005}, \qquad E[u_2] = \frac{3039}{5005}, \qquad E[u_3] = \frac{3843}{5005}, \tag{57}$$

$$E[u_1^2] = \frac{1028}{5005}, \qquad E[u_2^2] = \frac{1948}{5005}, \qquad E[u_3^2] = \frac{3030}{5005}, \tag{58}$$

$$E[u_1u_2] = \frac{6777}{25025}, \qquad E[u_1u_3] = \frac{333}{1001}, \qquad E[u_2u_3] = \frac{477}{1001}. \tag{59}$$
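The single-variable moments in (57) and (58) can be checked in exact rational arithmetic. The sketch below does not evaluate the triple integral (55) directly; instead it integrates the marginal density of each order statistic, which has the form of expression (15) with $F(u)=4u^3-3u^4$, the integral of (28). Polynomials are stored as coefficient lists, and the helper names are illustrative.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_pow(a, n):
    out = [Fraction(1)]
    for _ in range(n):
        out = poly_mul(out, a)
    return out

def integrate01(a):
    """Integral of the polynomial over [0, 1]."""
    return sum(c / (i + 1) for i, c in enumerate(a))

f = [Fraction(c) for c in (0, 0, 12, -12)]           # density (28): 12u^2 - 12u^3
F = [Fraction(c) for c in (0, 0, 0, 4, -3)]          # its integral: 4u^3 - 3u^4
one_minus_F = [Fraction(c) for c in (1, 0, 0, -4, 3)]

def order_stat_moment(r, power, n=3):
    """E[u_r^power] for the r-th order statistic in a sample of size n."""
    const = Fraction(factorial(n), factorial(r - 1) * factorial(n - r))
    dens = poly_mul(poly_mul(poly_pow(F, r - 1), poly_pow(one_minus_F, n - r)), f)
    u_pow = [Fraction(0)] * power + [Fraction(1)]
    return const * integrate01(poly_mul(u_pow, dens))

print([order_stat_moment(r, 1) for r in (1, 2, 3)])
print([order_stat_moment(r, 2) for r in (1, 2, 3)])
```

The product moments in (59) involve the joint density of two order statistics and are not covered by this sketch.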
The lower moments of order statistics for samples of size three from the
distribution described by equation (30) can now be obtained by direct
substitution of the right side of (29) for $u$ in (57), (58), and (59), since
there is a linear relationship between $u$ and $x$. Thus, we obtain

$$
\begin{aligned}
E[x_1] &= \theta_1 - \frac{3629}{15015}\theta_2,\\
E[x_2] &= \theta_1 - \frac{893}{15015}\theta_2,\\
E[x_3] &= \theta_1 + \frac{1519}{15015}\theta_2,
\end{aligned}
\tag{60}
$$

$$
\begin{aligned}
E[x_1^2] &= \theta_1^2 - \frac{7258}{15015}\theta_1\theta_2 + \frac{3748}{45045}\theta_2^2,\\
E[x_2^2] &= \theta_1^2 - \frac{1786}{15015}\theta_1\theta_2 + \frac{1084}{45045}\theta_2^2,\\
E[x_3^2] &= \theta_1^2 + \frac{3038}{15015}\theta_1\theta_2 + \frac{1174}{45045}\theta_2^2,
\end{aligned}
\tag{61}
$$

$$
\begin{aligned}
E[x_1x_2] &= \theta_1^2 - \frac{4522}{15015}\theta_1\theta_2 + \frac{6113}{225225}\theta_2^2,\\
E[x_1x_3] &= \theta_1^2 - \frac{422}{3003}\theta_1\theta_2 - \frac{163}{9009}\theta_2^2,\\
E[x_2x_3] &= \theta_1^2 + \frac{626}{15015}\theta_1\theta_2 + \frac{193}{45045}\theta_2^2.
\end{aligned}
\tag{62}
$$


From these values, the variances and covariances of the order statistics
are found to be

$$
\begin{aligned}
\sigma_1^2 &= E[x_1^2] - (E[x_1])^2 = \frac{621{,}011}{25{,}050{,}025}\theta_2^2,\\
\sigma_2^2 &= E[x_2^2] - (E[x_2])^2 = \frac{514{,}219}{25{,}050{,}025}\theta_2^2,\\
\sigma_3^2 &= E[x_3^2] - (E[x_3])^2 = \frac{396{,}501}{25{,}050{,}025}\theta_2^2,
\end{aligned}
\tag{63}
$$

$$
\begin{aligned}
\sigma_{12} &= E[x_1x_2] - E[x_1]E[x_2] = \frac{319{,}824}{25{,}050{,}025}\theta_2^2,\\
\sigma_{13} &= E[x_1x_3] - E[x_1]E[x_3] = \frac{159{,}264}{25{,}050{,}025}\theta_2^2,\\
\sigma_{23} &= E[x_2x_3] - E[x_2]E[x_3] = \frac{258{,}048}{25{,}050{,}025}\theta_2^2.
\end{aligned}
\tag{64}
$$


Lower moments of linear estimates of the mode

A linear function

$$\hat\theta_1 = w_1x_1 + w_2x_2 + w_3x_3 \tag{65}$$

of the order statistics $x_1$, $x_2$, $x_3$ in a sample of size three from the
distribution described by equation (30) has the mean value

$$E[\hat\theta_1] = w_1E[x_1] + w_2E[x_2] + w_3E[x_3]. \tag{66}$$

Its variance is

$$E[\hat\theta_1^{\,2}] - (E[\hat\theta_1])^2 = w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + w_3^2\sigma_3^2
+ 2(w_1w_2\sigma_{12} + w_1w_3\sigma_{13} + w_2w_3\sigma_{23}). \tag{67}$$

Combining (66) with (60), and (67) with (63) and (64), and employing
the weights in (49), (50), and (51), we obtain

$$E[\hat\theta_1] = \theta_1 - .009202537\theta_2, \tag{68}$$

$$E[\hat\theta_1^{\,2}] - (E[\hat\theta_1])^2 = .012464878\theta_2^2. \tag{69}$$

The bias, obtained by subtracting $\theta_1$ from (68), is therefore $-.009202537\theta_2$,
while the variance is given by (69). Adding the square of the bias to
the variance, we obtain

$$E[(\hat\theta_1-\theta_1)^2] = .012549564\theta_2^2, \tag{70}$$

which is the mean square error of the estimates. These results are
shown in Table 2 in the column headed "'Maximum likelihood'
estimates."
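The figures in (68), (69), and (70) follow from (66) and (67) by direct arithmetic, which the sketch below carries out for $\theta_1=0$ and $\theta_2=1$. The names `MEANS`, `COV`, and `linear_estimate_properties` are illustrative, and floating-point arithmetic is used for brevity.

```python
# Means from (60) and variances/covariances from (63), (64), with theta_1 = 0, theta_2 = 1.
MEANS = [-3629 / 15015, -893 / 15015, 1519 / 15015]
D = 25_050_025
COV = [
    [621_011 / D, 319_824 / D, 159_264 / D],
    [319_824 / D, 514_219 / D, 258_048 / D],
    [159_264 / D, 258_048 / D, 396_501 / D],
]

def linear_estimate_properties(w):
    """Bias, variance, and mean square error of w1*x1 + w2*x2 + w3*x3 (theta_1 = 0)."""
    bias = sum(wi * mi for wi, mi in zip(w, MEANS))                              # (66)
    variance = sum(w[i] * COV[i][j] * w[j] for i in range(3) for j in range(3))  # (67)
    return bias, variance, bias**2 + variance

ml_weights = [0.229287748, 0.197679424, 0.573032828]   # (49), (50), (51)
print(linear_estimate_properties(ml_weights))
```

The same function can be applied to any other set of three weights, for example those in the remaining columns of Table 2.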


To yield unbiased estimates of the mode of distribution (30) for all
values of $\theta_1$ and $\theta_2$, the weights $w_1$, $w_2$, $w_3$ must satisfy the condition

$$w_1E[x_1] + w_2E[x_2] + w_3E[x_3] = \theta_1 \tag{71}$$

for every $\theta_1$ and $\theta_2$. Combining (71) with (60) leads to



126 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1953 


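$$w_1\left(\theta_1 - \frac{3629}{15015}\theta_2\right) + w_2\left(\theta_1 - \frac{893}{15015}\theta_2\right)
+ w_3\left(\theta_1 + \frac{1519}{15015}\theta_2\right) = \theta_1, \tag{72}$$

which must be satisfied identically. Hence, the weights must satisfy
the two conditions

$$w_1 + w_2 + w_3 = 1, \tag{73}$$

$$3629w_1 + 893w_2 - 1519w_3 = 0. \tag{74}$$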


Combining (67) with (63) and (64), the variance of any linear function
of the order statistics $x_1$, $x_2$, and $x_3$ is found to be

$$\frac{\theta_2^2}{25{,}050{,}025}\{621{,}011w_1^2 + 514{,}219w_2^2 + 396{,}501w_3^2
+ 2(319{,}824w_1w_2 + 159{,}264w_1w_3 + 258{,}048w_2w_3)\}. \tag{75}$$

To find the regular unbiased linear estimate with minimum variance,
we write down the expression

$$
\begin{aligned}
&621{,}011w_1^2 + 514{,}219w_2^2 + 396{,}501w_3^2\\
&\quad + 2(319{,}824w_1w_2 + 159{,}264w_1w_3 + 258{,}048w_2w_3)\\
&\quad + \lambda_1(w_1+w_2+w_3-1) + \lambda_2(3629w_1 + 893w_2 - 1519w_3),
\end{aligned}
\tag{76}
$$

where $\lambda_1$ and $\lambda_2$ are Lagrange multipliers, and set its partial derivatives
with respect to $w_1$, $w_2$, $w_3$ each equal to zero. After eliminating $\lambda_1$ and
$\lambda_2$, we obtain the equation

$$613w_1 - 2497w_2 + 300w_3 = 0. \tag{77}$$

Solving (73), (74), and (77) simultaneously for $w_1$, $w_2$, and $w_3$ yields

$$w_1 = 3{,}525{,}043/15{,}153{,}912, \tag{78}$$

$$w_2 = 2{,}019{,}847/15{,}153{,}912, \tag{79}$$

$$w_3 = 9{,}609{,}022/15{,}153{,}912. \tag{80}$$

The variance and mean square error of the estimates computed with
these weights are shown in Table 2 in the column headed "'Best'
unbiased estimates."

The mean square error of the estimates considered here may be found
directly by setting $\theta_1=0$ in equations (61) and (62) and then computing
the value of the expression

$$w_1^2E[x_1^2] + w_2^2E[x_2^2] + w_3^2E[x_3^2]
+ 2(w_1w_2E[x_1x_2] + w_1w_3E[x_1x_3] + w_2w_3E[x_2x_3]), \tag{81}$$

which is now equivalent to

$$\frac{\theta_2^2}{225{,}225}(18{,}740w_1^2 + 5{,}420w_2^2 + 5{,}870w_3^2
+ 12{,}226w_1w_2 - 8{,}150w_1w_3 + 1{,}930w_2w_3). \tag{82}$$

The regular linear estimate with minimum mean square error may be
found by writing down the expression

$$18{,}740w_1^2 + 5{,}420w_2^2 + 5{,}870w_3^2 + 12{,}226w_1w_2
- 8{,}150w_1w_3 + 1{,}930w_2w_3 + \lambda(w_1+w_2+w_3-1), \tag{83}$$

and then setting the partial derivatives with respect to $w_1$, $w_2$, and $w_3$
each equal to zero. After eliminating $\lambda$, we obtain the equations

$$12{,}627w_1 + 693w_2 - 5{,}040w_3 = 0, \tag{84}$$

$$10{,}188w_1 + 4{,}455w_2 - 4{,}905w_3 = 0. \tag{85}$$

Solving these two equations and (73) simultaneously leads to

$$w_1 = 18{,}095/74{,}867, \tag{86}$$

$$w_2 = 10{,}055/74{,}867, \tag{87}$$

$$w_3 = 46{,}717/74{,}867. \tag{88}$$

The bias, variance, and mean square error of the estimates obtained
from these weights are shown in the last column of Table 2. The weights
for all three types of estimates covered by the table are shown there in
decimal form to permit ready comparison.
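Both constrained minimizations in the appendix can be reproduced exactly by solving the Lagrange conditions as one linear system. The sketch below is an illustration in rational arithmetic; the helper names `solve` and `constrained_min` are hypothetical, and the common factors $\theta_2^2/25{,}050{,}025$ and $\theta_2^2/225{,}225$ are dropped because they do not affect the minimizing weights.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square linear system A x = b exactly by Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(v)] for row, v in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

def constrained_min(Q, C, d):
    """Minimize w'Qw subject to Cw = d via the Lagrange-multiplier linear system."""
    n, m = len(Q), len(C)
    A = [[2 * Q[i][j] for j in range(n)] + [C[k][i] for k in range(m)] for i in range(n)]
    A += [list(C[k]) + [0] * m for k in range(m)]
    return solve(A, [0] * n + list(d))[:n]

# Numerators of (63), (64): quadratic form (75); constraints (73) and (74).
Q_cov = [[621011, 319824, 159264], [319824, 514219, 258048], [159264, 258048, 396501]]
best_unbiased = constrained_min(Q_cov, [[1, 1, 1], [3629, 893, -1519]], [1, 0])

# Quadratic form (82), written symmetrically; single constraint (73).
Q_mse = [[18740, 6113, -4075], [6113, 5420, 965], [-4075, 965, 5870]]
min_mse = constrained_min(Q_mse, [[1, 1, 1]], [1])

print(best_unbiased)
print(min_mse)
```

Working in fractions avoids any rounding question when comparing with (78) to (80) and (86) to (88); converting the results with `float()` gives the decimal weights of Table 2.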
ESTIMATING THE RATIO BETWEEN THE PROPORTIONS
OF TWO CLASSES WHEN ONE IS A SUB-CLASS OF
THE OTHER


Jack M. Elkin
U. S. Railroad Retirement Board


1. Consider a population of size $N$, of which $p_1N$ items have the
characteristic A, of which, in turn, $p_2N$ also have the characteristic
B; i.e., the proportion of A-items which are also B-items is $p_2/p_1$.
For example, $p_1$ may be the proportion of individuals eligible to retire
among a group of individuals covered by a retirement program at the
beginning of the year and $p_2/p_1$ the proportion of eligibles who retire
during the year, or $p_1$ may be the proportion of families in a given
income class among a group of families in all income classes and $p_2/p_1$
the proportion of families in the specified income class who own their
own homes. A straightforward problem is one of estimating $p_2/p_1$
from a simple random sample of size $n$ drawn from $N$. The problem as
stated in the title is, of course, equivalent to one of estimating the ratio
of the overlap between two overlapping classes to one of the classes.

Sometimes the actual value of $p_1$ or $p_2$ is known beforehand. On the
"theory" that it is always better to use a known than an estimated
datum, one might be tempted to estimate $p_2/p_1$ by estimating the
unknown proportion from the sample and combining that estimate with
the known proportion. This, however, is not necessarily the best course
to follow. If we consider the "better" of two estimates to be the one
with the smaller coefficient of variation, it can be shown, for a
sufficiently large sample, that

(a) even if $p_2$ is known, a better estimate of $p_2/p_1$ (or of $p_1/p_2$)
can be obtained by estimating both proportions from the sample if
$p_2 > p_1/(2-p_1)$, or, alternatively, if $p_1 < 2p_2/(1+p_2)$, and

(b) even if $p_1$ is known, a better estimate of $p_2/p_1$ (or of $p_1/p_2$)
can be obtained by estimating both proportions from the sample,
regardless of the values of $p_1$ and $p_2$.

The criterion in (a) cannot be applied exactly in practice, since $p_1$ is
not known, but its sample value can almost always be satisfactorily
substituted for the universe value in the inequalities.

By way of illustration, suppose that, on the basis of a simple random
sample taken of 100,000 employees covered by a pension program at
the beginning of a year, it is estimated that 5,000 were eligible to retire
during the year and that 3,000 of these did retire. Suppose, further,
that the exact number of those who retired is known, from another
source, to be 3,200. Since $.03 > .05/(2-.05)$, the number of retirements
as a proportion of the number eligible for retirement should be quoted
as 3/5 or 60 per cent, rather than 3.2/5 or 64 per cent. On the other
hand, suppose that the sample estimate of the number who retired is
2,000, but that the exact number is known to be 2,100. Since
$.02 < .05/(2-.05)$, the proportion should be quoted as 2.1/5 or 42 per
cent rather than 2/5 or 40 per cent. If it is the number eligible to retire
that happens to be known exactly from another source, it should
nevertheless be disregarded, and the proportion should be calculated
from the sample numbers in all cases.
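Criterion (a), as applied in the illustration, amounts to a one-line comparison. The sketch below is hypothetical in its function and parameter names; `base` stands for the 100,000 employees to which the counts refer, and the sample value of $p_1$ is substituted for the universe value as the text suggests.

```python
def prefer_sample_ratio(p1_sample, p2):
    """Criterion (a): estimate both proportions from the sample when p2 > p1/(2 - p1)."""
    return p2 > p1_sample / (2.0 - p1_sample)

def quoted_ratio(eligible_sample, retired_sample, retired_known, base):
    """Ratio of retirements to eligibles to quote when the retirement count is known exactly."""
    p1_sample = eligible_sample / base
    p2_sample = retired_sample / base
    if prefer_sample_ratio(p1_sample, p2_sample):
        return retired_sample / eligible_sample    # both figures from the sample
    return retired_known / eligible_sample         # known numerator, sample denominator

print(quoted_ratio(5000, 3000, 3200, 100000))   # first case in the text
print(quoted_ratio(5000, 2000, 2100, 100000))   # second case in the text
```

Under conclusion (b) no such test is needed when it is the number eligible that is known: the sample figures are used for both numerator and denominator.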

The conclusions stated in (a) and (b) above follow easily from a
comparison of the coefficients of variation of $p_2'/p_1'$, $p_2/p_1'$, and $p_2'/p_1$
(or of their reciprocals), the dashes being used to indicate sample
values. The expected value, variance, and coefficient of variation of
$p_2'/p_1'$ are given, respectively, by

$$E\!\left(\frac{p_2'}{p_1'}\right) = \frac{p_2}{p_1},$$

$$V\!\left(\frac{p_2'}{p_1'}\right) \approx \frac{N-n}{Nn}\cdot\frac{p_2(p_1-p_2)}{p_1^3},$$

$$\left[\mathrm{C.V.}\!\left(\frac{p_2'}{p_1'}\right)\right]^2 \approx \frac{N-n}{Nn}\left(\frac{1}{p_2}-\frac{1}{p_1}\right),$$

where the approximations assume that $n$ is sufficiently large.¹ If the
reciprocal of $p_2'/p_1'$ is expanded in a Taylor series in the neighborhood
of the expected value of $p_2'/p_1'$, it can be seen that, for sufficiently large
$n$,

$$E\!\left(\frac{p_1'}{p_2'}\right) \approx \frac{p_1}{p_2},$$

$$\left[\mathrm{C.V.}\!\left(\frac{p_1'}{p_2'}\right)\right]^2 \approx \frac{N-n}{Nn}\left(\frac{1}{p_2}-\frac{1}{p_1}\right).$$

Likewise, if the Taylor series expansion is applied to $1/p_1'$, it follows
from the well known expressions for $E(p_1')$ and $V(p_1')$ that

$$E\!\left(\frac{1}{p_1'}\right) \approx \frac{1}{p_1},$$

$$\left[\mathrm{C.V.}\!\left(\frac{1}{p_1'}\right)\right]^2 \approx \frac{N-n}{Nn}\left(\frac{1}{p_1}-1\right).$$

1 See W. E. Deming, Some Theory of Sampling, pp. 452-54, quoting a proof by F. F. Stephan. Stephan's
proof excludes the possibility of a zero value for the denominator proportion.

For completeness, we have the familiar formula,

$$[\mathrm{C.V.}(p_2')]^2 \approx \frac{N-n}{Nn}\left(\frac{1}{p_2}-1\right).$$

The approximate formulas for the coefficients of variation are
asymptotic expressions and so are the efficiency comparisons based on them.
2. A useful application of these results can often be made, even
when the parameter under investigation is simply the proportion of
cases in the universe possessing a specified attribute.² Thus, suppose
the proportion ($p_1$) of A-items is desired, the proportion ($p_2$) of
B-items is known, the B-items are a sub-class of the A-items, and
$p_2 > p_1'/(2-p_1')$, or, alternatively, $p_1' < 2p_2/(1+p_2)$ (where $p_1'$ is the
sample value for $p_1$). Under these conditions, it is preferable to obtain
the sample value ($p_2'$) of $p_2$ from the same sample (assuming the
B-items can conveniently be counted along with the A-items), and
use $p_1'p_2/p_2'$ instead of $p_1'$ as the estimate of $p_1$. The advantage of this
procedure can be measured by the ratio

$$\frac{\mathrm{C.V.}(p_1'p_2/p_2')}{\mathrm{C.V.}(p_1')} \approx \left[\frac{p_1-p_2}{p_2(1-p_1)}\right]^{1/2}.$$

For example, if $p_1=.2$ and $p_2=.15$, $\mathrm{C.V.}(p_1'p_2/p_2') = .6\ \mathrm{C.V.}(p_1')$.

Similarly, suppose the proportion ($p_2$) of B-items is desired, the
proportion ($p_1$) of A-items is known, and the B-items are a sub-class of
the A-items. It is preferable to obtain the sample value ($p_1'$) of $p_1$
from the same sample (assuming the A-items can conveniently be
counted along with the B-items), and use $p_2'p_1/p_1'$ instead of $p_2'$ as
the estimate of $p_2$. The advantage of this procedure can be measured
by the ratio

$$\frac{\mathrm{C.V.}(p_2'p_1/p_1')}{\mathrm{C.V.}(p_2')} \approx \left[\frac{p_1-p_2}{p_1(1-p_2)}\right]^{1/2}.$$

For example, if $p_1=.2$ and $p_2=.15$, $\mathrm{C.V.}(p_2'p_1/p_1') = .5\ \mathrm{C.V.}(p_2')$.
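The two advantage ratios can be evaluated directly. The sketch below uses illustrative function names and reproduces the two worked figures for $p_1=.2$ and $p_2=.15$, which the text quotes to one decimal place.

```python
import math

def cv_ratio_for_p1(p1, p2):
    """C.V.(p1' p2 / p2') relative to C.V.(p1'): sqrt((p1 - p2) / (p2 (1 - p1)))."""
    return math.sqrt((p1 - p2) / (p2 * (1.0 - p1)))

def cv_ratio_for_p2(p1, p2):
    """C.V.(p2' p1 / p1') relative to C.V.(p2'): sqrt((p1 - p2) / (p1 (1 - p2)))."""
    return math.sqrt((p1 - p2) / (p1 * (1.0 - p2)))

print(round(cv_ratio_for_p1(0.2, 0.15), 3))
print(round(cv_ratio_for_p2(0.2, 0.15), 3))
```

A ratio below 1 means the adjusted estimate has the smaller coefficient of variation. The first ratio falls below 1 exactly when $p_2 > p_1/(2-p_1)$, and the second is below 1 for all admissible $p_1$ and $p_2$, in agreement with (a) and (b).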


2 See also W. E. Deming, op. cit., pp. 165 ff.

0. PROBLEM


The recognition of periodicities, their amplitude and length, is one
of the main tasks of the application of statistics to long-range data
and especially to economic phenomena. Although numerous theorems
have been developed, the question of how to recognize a cycle remains
open, since the complicated mathematical methods used in the
periodogram and correlogram analysis may generate both real and
factitious cycles and may conceal real ones.

Therefore, the study of cycles is a fertile field for nonscientific
procedures. Since even a constant value in a certain interval can easily be
reproduced by a Fourier series, cycles with superimposed epicycles and
third and fourth cycles have been "found" in nearly every domain that
may be represented by numbers. Different observers have found
different cycles in the same series, not to mention the great cycles of history
leading to forecasts, good advice and policies advocated by their
prophets. This abuse has discredited the whole theory of cycles.

In contrast to these procedures, we examine here a specialized
distribution function. Applications to recurrent phenomena of known
period will be given in a second article. Since the distribution function
has only two disposable parameters, and since we consider using it
only in connection with such phenomena as seasonal or diurnal
variations (where the length of period is definitely established from
nonstatistical considerations), we feel that the chance of generating false
cycles is small indeed.

The distribution function under discussion was tabulated because
it possesses important theoretical properties of the linear normal
distribution and because it has practical applications to such
meteorological phenomena as the direction of the wind, which is an
angular or “circular” variable. Wind directions are represented
clockwise on the compass from north to east, south, west, and back to
north in the same way as an angle varies from zero to $2\pi$. Each
direction or angle has a certain frequency and the sum of the
frequencies is the number of observations, diminished by the number of
calm conditions. The question then arose of extending the use of this
function to time series with seasonal, or possibly diurnal, components.
Assume n events happen during a year, or a span of several years. These
events might be storms, or deaths, or the completion of automobiles.
Each event happens at a certain date. If this date is considered as the
chance variate $\alpha$, the distribution of events over the year may be
considered a circular distribution of an angle.


1. STATISTICS OF A CIRCULAR DISTRIBUTION


We first consider the purely empirical procedures which are used 
especially by climatologists in order to characterize circular distribu- 
tions. These descriptive measures will prove basic for the construction 
of the theory and the estimation of the parameters. 

Let n points be situated in some fashion on the circumference of a
unit circle: then $\alpha_\nu$ ($\nu = 1, 2, \dots, n$) are the angles of
the radii drawn from the center. To characterize such circular
observations their vector mean has been used: this is clearly identical
with the centroid of the set of n points. To calculate it we introduce
the rectangular coordinates $x_\nu = \cos\alpha_\nu$,
$y_\nu = \sin\alpha_\nu$. Then

(1.1)    $\bar{x} = \frac{1}{n}\sum_{\nu=1}^{n} x_\nu, \qquad \bar{y} = \frac{1}{n}\sum_{\nu=1}^{n} y_\nu$

defines the vector mean.

For our purposes it is, however, more convenient to express this vector
mean in polar coordinates. Let $\bar{a}$ and $\alpha_0$ be a solution of
the equations $\bar{x} = \bar{a}\cos\alpha_0$,
$\bar{y} = \bar{a}\sin\alpha_0$. It is well known that this solution is
unique unless $\bar{x} = \bar{y} = 0$, and is given by

(1.2)    $\bar{a} = (\bar{x}^2 + \bar{y}^2)^{1/2}$

(1.3)    $\alpha_0 = \arctan(\bar{y}/\bar{x})$

where the quadrant in which $\alpha_0$ lies must be determined by
inspection of the signs of $\bar{x}$ and $\bar{y}$. Equation (1.2), which
defines the vector strength $\bar{a}$ at the mean direction $\alpha_0$,
may be written from (1.1)

(1.2')   $\bar{a} = \frac{1}{n}\left[\left(\sum_{\nu=1}^{n}\sin\alpha_\nu\right)^2 + \left(\sum_{\nu=1}^{n}\cos\alpha_\nu\right)^2\right]^{1/2}.$

Finally the mean direction $\alpha_0$ satisfies the equation

(1.3')   $\sum_{\nu=1}^{n}\sin(\alpha_\nu - \alpha_0) = 0.$

The statistic $\alpha_0$ is analogous to the average of a linear variate;
the meaning of $\bar{a}$ will be explained later.

The observations $\alpha_\nu$ may be given individually or, and this is
mostly the case, in grouped form, i.e. as frequencies $p_\nu$ attributed
to certain intervals, say months, of equal length.¹ In the individual
case, we calculate the sine and the cosine for each observation, and
obtain $\alpha_0$ and $\bar{a}$ from (1.2') and (1.3). The grouped case
permits a simplification. Assume that the frequencies for the 12 months
are given, and assume furthermore months of thirty days each. Then one
day corresponds to one degree. We attribute the frequencies to the
midpoint of each month and assume, say, the middle of July as
preliminary value for $\alpha_0$. Then the observations can be arranged
in the following scheme:


TABLE 1
SCHEME FOR A CIRCULAR DISTRIBUTION OVER THE YEAR

                           North
                            360°
          330°  June        July        August      30°
          300°  May                     September   60°
   West   270°  April                   October     90°   East
          240°  March                   November    120°
          210°  February                December    150°
                           January
                            180°
                           South


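The mean direction is obtained from the following trigonometric rules:

(1.4)    $\sum_{\nu=1}^{n}\cos\alpha_\nu = \text{July} - \text{Jan.} + 0.86603\,(\text{Aug.} - \text{Feb.} + \text{June} - \text{Dec.}) + 0.5\,(\text{Sept.} - \text{March} + \text{May} - \text{Nov.})$

(1.5)    $\sum_{\nu=1}^{n}\sin\alpha_\nu = \text{Oct.} - \text{Apr.} + 0.86603\,(\text{Sept.} - \text{March} + \text{Nov.} - \text{May}) + 0.5\,(\text{Aug.} - \text{Feb.} + \text{Dec.} - \text{June}).$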


¹ Whether corrections for length of month are desirable in practical
calculations will depend on the degree of refinement desired, and on the
amplitude of the seasonal movements. When the seasonals are of the order
of 10 per cent or less, variations in length of month will produce
noticeable irregularities. February, after all, has 10 per cent fewer
days than January. The details of the corrections will be given in the
second article.

Of course this procedure can also be applied to any other starting point
after the appropriate rotations.
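The grouped calculation can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names and the monthly counts are hypothetical, months are taken as thirty degrees each with mid-July at zero as in Table 1, and `atan2` stands in for determining the quadrant "by inspection."

```python
import math

def sums_direct(freq):
    """Sum of cosines and sines with each month's frequency at its midpoint.

    freq lists 12 monthly frequencies, January first. Mid-July is 0 degrees
    and each later month adds 30 degrees, as in Table 1.
    """
    angles = [math.radians(30 * ((m - 6) % 12)) for m in range(12)]
    c = sum(f * math.cos(a) for f, a in zip(freq, angles))
    s = sum(f * math.sin(a) for f, a in zip(freq, angles))
    return c, s

def sums_by_rules(freq):
    """The same two sums written out as in rules (1.4) and (1.5)."""
    jan, feb, mar, apr, may, jun, jul, aug, sep, oct_, nov, dec = freq
    r = 0.86603
    c = jul - jan + r * (aug - feb + jun - dec) + 0.5 * (sep - mar + may - nov)
    s = oct_ - apr + r * (sep - mar + nov - may) + 0.5 * (aug - feb + dec - jun)
    return c, s

def mean_direction_and_strength(freq):
    """Mean direction (degrees from mid-July) and vector strength, per (1.2'), (1.3)."""
    n = sum(freq)
    c, s = sums_direct(freq)
    a_bar = math.hypot(c, s) / n
    alpha0 = math.degrees(math.atan2(s, c)) % 360.0
    return alpha0, a_bar

# Hypothetical monthly counts, January to December (not data from the paper).
example = [5, 6, 8, 10, 14, 18, 20, 17, 13, 9, 7, 5]
alpha0, a_bar = mean_direction_and_strength(example)
print(f"mean direction {alpha0:.1f} degrees from mid-July, vector strength {a_bar:.3f}")
```

The two ways of forming the sums agree to within the rounding of 0.86603, which is a convenient check on hand calculation.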
A circular distribution may be graphically represented by a linear
histogram. Numerous arrangements are possible, including:
(1) Listing the months in their normal order from January on the left to
December on the right;
(2) Placing the modal month in the middle with one-half of the sixth
following month at either end;
(3) Locating the calculated mean (or mode) in the middle.


A circular distribution may also be plotted on polar coordinate paper. 
Here the usual procedure is to trace the frequencies of the various 
months as radii vectores at the appropriate angles; the result is a 
polar wedge diagram corresponding to the linear histogram. By this 
procedure, however, the area of each wedge varies as the square of the 
frequency, which means that if two different distributions of, say, 100 
observations are plotted, their areas will in general differ. To conserve 
areas, Leighly [8] proposes to trace the square roots of the frequencies, 
and this procedure, henceforth called aequiareal, is used in the follow- 
ing. An obvious alternative is to use aequiareal polar paper, which 
may be constructed by graduating the radii according to a square root 
scale.

Two limiting cases of circular distributions are immediately evident.
The first is the uniform distribution, where any angle is as likely to
occur as any other. Then the density of probability
$f(\alpha) = 1/2\pi$ ($\alpha$ measured in radians) is obviously
independent of $\alpha$. The cumulative probability function (defined in
analogy to the linear procedure by $dF/d\alpha = f(\alpha)$, with the
boundary conditions $F(0) = 0$ and $F(2\pi) = 1$) is simply
$F(\alpha) = \alpha/2\pi$. Pólya used an n-dimensional analogue of this
assumption when he investigated whether the stars are distributed at
random over the celestial sphere, and also when he calculated the mean
distance between two points for any number of dimensions [14, 16].

The other limiting case occurs when all the angles are the same. This, 
of course, has no practical or theoretical interest, but the asymptotic 
approach to this case is worth discussion. As an example, a set of angles 
might be normally distributed with mean 0 and standard deviation 1°. 
Then no sensible fraction of the total frequency would be found in most 
of the circle; for example, the arc from 5° to 355° would be virtually
void. Asymptotic distributions of this type do not differ sensibly from 
the linear normal. They have been used for the analysis of the Brownian 
movement and the paths of the beta rays [6], and they are the basis 
for Schelling's work on the most frequent particle paths in two and 
three dimensions [18].

The problem of circular distributions becomes significant when a
sensible portion of the observations may occur over all or nearly all of
the circumference. As in the linear case an infinity of circular
distributions exists. Among them a few, possessing common desirable
properties, have been studied. The distributions studied are all
symmetrical, although many observed phenomena show asymmetry. Moreover,
these distributions possess one mode, i.e., one most probable direction.
From both properties, symmetry and unimodality, it follows that nowhere
is the probability less than its value at the antipode of the mode,
henceforth called the antimode. Nothing is lost in generality if we put
the mode at zero, north on the map, and the antimode at $\pi$, south.

GRAPH 1. The cardioid and the aequiareal trace of the cardioid distribution.
A simple example of a circular distribution is the sine function
modified in such a way that the density of probability is non-negative
and the area under the distribution is unity. This leads to the
symmetrical distribution

$$f(\alpha) = \frac{1}{2\pi}(1 + \cos\alpha).$$

The density of probability is $1/\pi$ at the mode and zero at the
antimode. When $f(\alpha)$ is traced in polar coordinates, the resulting
curve is the cardioid. Graph 1 compares the cardioid trace of
$f(\alpha)$ with the aequiareal trace.

Other circular distributions can be generated by wrapping an unlimited
linear distribution around a circumference. This problem has been
studied by Lévy, Marcinkiewicz, and Wintner [9, 10, 20], who have thus
treated the normal and the Cauchy distributions. The normal wrapped-up
distribution with mode zero may be written

(1.6)    $f(\alpha) = \frac{1}{\sigma\sqrt{2\pi}}\sum_{k=-\infty}^{\infty} e^{-(\alpha + 2\pi k)^2/2\sigma^2} = \frac{1}{2\pi}\sum_{n=-\infty}^{\infty} q^{n^2}\cos n\alpha \qquad (0 \le \alpha \le 2\pi)$


where $\ln q = -\sigma^2/2$. Graph 2 shows it traced on aequiareal polar
scale for $\sigma = 50°$ and $100°$, corresponding respectively to
$q = .68333383$ and $.21803774$. When $\sigma > 77°$ approximately, this
angular distribution may be computed from tables of elliptic theta
functions. Theoretical advantages recommend this distribution; it
presents, however, serious difficulties, especially if the method of
maximum likelihood is to be used to estimate its parameters.

GRAPH 2. Wintner's circular distribution in aequiareal scale.
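The equivalence of the two forms of the wrapped-up normal distribution can be verified numerically. The sketch below is illustrative only; the function names and the truncation at 50 terms are assumptions made here, and $\sigma$ is taken in radians with $\ln q = -\sigma^2/2$.

```python
import math

def wrapped_normal_by_wrapping(alpha, sigma, terms=50):
    """Sum of linear normal densities displaced by whole turns of 2*pi."""
    total = 0.0
    for j in range(-terms, terms + 1):
        z = (alpha + 2.0 * math.pi * j) / sigma
        total += math.exp(-0.5 * z * z)
    return total / (sigma * math.sqrt(2.0 * math.pi))

def wrapped_normal_by_cosine_series(alpha, sigma, terms=50):
    """Cosine series with coefficients q**(n*n), where ln q = -sigma**2 / 2."""
    log_q = -0.5 * sigma * sigma
    total = 1.0
    for n in range(1, terms + 1):
        total += 2.0 * math.exp(n * n * log_q) * math.cos(n * alpha)
    return total / (2.0 * math.pi)

sigma = math.radians(50.0)
print(math.exp(-0.5 * sigma * sigma))  # the value of q for sigma = 50 degrees
print(wrapped_normal_by_wrapping(1.0, sigma), wrapped_normal_by_cosine_series(1.0, sigma))
```

For small $\sigma$ the wrapping sum converges in a few terms, while for large $\sigma$ the cosine series is the quicker of the two.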


2. THE CIRCULAR NORMAL DISTRIBUTION 


Gauss has shown that the normal distribution can be derived by 
what is now called the principle of maximum likelihood and a single 
assumption that the mean is the most probable value. In 1918 Mises 
[11] applied the method of Gauss to a circular variate and derived the 
distribution that is the subject of this paper. His procedure is
essentially as follows: Mises asks for such a distribution $f(\xi)$,
where $\xi$ is an “error of observation” $\xi_\nu = \alpha_\nu - \alpha$,
that the ratio of the a posteriori to the a priori probability (the
likelihood function) of a “true value” $\alpha$ upon n observations
$\alpha_1, \dots, \alpha_n$ is to be a maximum for that $\alpha$ given by
equation (1.3'). The likelihood function is

$$\prod_{\nu=1}^{n} f(\alpha_\nu - \alpha).$$

The postulate means that

$$\sum_{\nu=1}^{n}\frac{f'(\alpha_\nu - \alpha_0)}{f(\alpha_\nu - \alpha_0)} = 0$$

should hold together with equation (1.3'). Since these two sums are
equal for arbitrary values of the $\alpha_\nu$, the equality must hold
term by term. This functional equation has the solution

$$f(\xi) = Ce^{k\cos\xi},$$


where the two positive parameters C and k are linked by the condition

$$\int_0^{2\pi} f(\xi)\,d\xi = 1.$$

Consequently

$$C = 1\Big/\int_0^{2\pi} e^{k\cos\xi}\,d\xi = 1/2\pi I_0(k)$$

where $I_0(x)$ is the Bessel function of the first kind of pure imaginary
argument. Therefore

(2.1)    $f(\alpha) = \frac{e^{k\cos(\alpha - \alpha_0)}}{2\pi I_0(k)}.$

This expression is henceforth called the circular normal distribution
since it was derived in a way which is strictly analogous to the classical
derivation of the linear normal distribution. Pólya [15, pp. 129-132]
has shown that the distribution (2.1) is the only one that shares certain
important properties of the linear normal.
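As a numerical illustration of the density (2.1), the following Python sketch evaluates it and confirms that the total area is unity. The helper names are hypothetical, and the Bessel function is summed from its ascending power series with a fixed number of terms, which is an assumption adequate only for moderate k (say k below 10).

```python
import math

def bessel_i(order, k, terms=40):
    """I_order(k) from the ascending power series (moderate k only)."""
    half = 0.5 * k
    return sum(
        half ** (order + 2 * t) / (math.factorial(t) * math.factorial(order + t))
        for t in range(terms)
    )

def circular_normal_density(alpha, k, alpha0=0.0):
    """Density (2.1) at angle alpha (radians) with mode alpha0 and concentration k."""
    return math.exp(k * math.cos(alpha - alpha0)) / (2.0 * math.pi * bessel_i(0, k))

def total_area(k, steps=3600):
    """Midpoint-rule integral of the density over one full turn."""
    h = 2.0 * math.pi / steps
    return h * sum(circular_normal_density((j + 0.5) * h, k) for j in range(steps))

print(total_area(2.0))                    # should be very close to 1
print(circular_normal_density(0.0, 0.0))  # uniform case, 1 / (2 * pi)
```

Setting k = 0 reproduces the uniform density, and the ratio of the values at the mode and at the antimode grows as $e^{2k}$.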

The density function has a mode at $\alpha_0$ and an antimode at
$\alpha_0 + \pi$. The quotient of the densities at the mode and at the
antimode is

(2.2)    $f(\alpha_0)/f(\alpha_0 + \pi) = e^{2k}.$

The larger the part of the distribution situated in the neighborhood of
the mode, the larger is k. Therefore, this parameter k is a measure of
concentration. For k = 0 the distribution (2.1) degenerates into the
uniform circular distribution. When k is significantly different from
zero, there is evidence of the existence of a cycle.

To get further insight into the nature of the parameter k consider
the points of inflection for the distribution traced on a linear scale.
Let the mode $\alpha_0$ be zero; then the points of inflection
$\alpha_1$ and $\alpha_2$ are the solutions of

$$f''(\alpha) = 0,$$

whence from (2.1)

$$k\sin^2\alpha = \cos\alpha.$$

The larger the parameter k the nearer are the points of inflection to
the mode. The preceding equation has the real solutions

(2.3)    $\cos\alpha = -\frac{1}{2k} + \sqrt{1 + \frac{1}{4k^2}}.$

For large values of k, the points of inflection are approximately

(2.3')   $\alpha_{1,2} = \pm 1/\sqrt{k},$

while for the linear normal distribution, the points of inflection are

$$\alpha_{1,2} = \pm\sigma.$$

If k is large, its reciprocal 1/k influences the circular normal
distribution in the same way as $\sigma^2$ influences the linear normal one.
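The relation between the points of inflection and k can be illustrated numerically. The sketch below, with a hypothetical function name, takes the positive root of $k\cos^2\alpha + \cos\alpha - k = 0$ (the quadratic obtained from $k\sin^2\alpha = \cos\alpha$) and compares the resulting angle with $1/\sqrt{k}$.

```python
import math

def inflection_angle(k):
    """Positive point of inflection (radians) of the density with mode zero, k > 0."""
    cos_alpha = -1.0 / (2.0 * k) + math.sqrt(1.0 + 1.0 / (4.0 * k * k))
    return math.acos(cos_alpha)

for k in (1.0, 4.0, 25.0, 100.0):
    # Second column: exact inflection angle; third column: large-k approximation.
    print(k, inflection_angle(k), 1.0 / math.sqrt(k))
```

The two columns draw together as k increases, in keeping with the analogy between 1/k and the variance of the linear normal distribution.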
This analogy leads to the second degenerate case. Let k be large; then
the distribution is highly concentrated about $\alpha_0$. Introduce a
reduced variate $\beta$ by writing

$$\alpha - \alpha_0 = \beta/\sqrt{k}.$$

The distribution of the variate $\beta$ is

$$f(\beta) = \text{const}\; e^{k\cos(\beta/\sqrt{k})}.$$

Since the argument of the cosine is small, we may neglect higher powers
of $\beta$ in the exponential and obtain a normal distribution

(2.4)    $f(\beta) = \frac{1}{\sqrt{2\pi}}\,e^{-\beta^2/2}$

with mean zero and unit standard deviation. In fact, for large k the
reduced distribution $f(\beta)$ converges to this linear normal distribution.
For the estimation of the parameters consider the likelihood function

$$L = (2\pi)^{-n} I_0^{-n}(k)\exp\left[k\sum_{\nu=1}^{n}\cos(\alpha_\nu - \alpha_0)\right].$$

Equating $\partial/\partial\alpha_0 \log L$ to zero we obtain equation
(1.3'). Equating $\partial/\partial k \log L$ to zero we obtain

$$\frac{I_0'(k)}{I_0(k)} = \frac{1}{n}\sum_{\nu=1}^{n}\cos(\alpha_\nu - \alpha_0).$$

The expression on the right is easily transformed with the help of
(1.3') into

$$\frac{1}{n}\left[\left(\sum_{\nu=1}^{n}\cos\alpha_\nu\right)^2 + \left(\sum_{\nu=1}^{n}\sin\alpha_\nu\right)^2\right]^{1/2},$$

which by (1.2') equals the vector strength $\bar{a}$. Therefore the
parameter k is the solution of

(2.5)    $I_1(k) - \bar{a} I_0(k) = 0$

where

$$I_1(k) = I_0'(k).$$

Equation (2.5) was solved for k by interpolating in a table of Bessel
functions [16a]. Table 2 gives k as a function of the observed vector
strength $\bar{a}$. Tables for testing the significance of observed
$\bar{a}$ are under preparation.

K. Arnold [1] has generalized the distribution (2.1) to the bivariate
case, considering the center of gravity of points on a sphere. He has
also generalized equation (1.6) to the bivariate case by using Fourier's
equation of heat flow, which yields the normal linear equation as a
special case.


TABLE 2
PARAMETER k AS A FUNCTION OF VECTOR STRENGTH $\bar{a}$
$\bar{a} = I_1(k)/I_0(k)$

  ā       k       δ²          ā       k       δ²
 .00   0.00000               .45   1.01022     52
 .01   0.02000     1         .46   1.03889     54
 .02   0.04001     1         .47   1.06810     57
 .03   0.06003     1         .48   1.09788     62
 .04   0.08006     4         .49   1.12828     64
 .05   0.10013     2         .50   1.15932     69
 .06   0.12022     3         .51   1.19105     72
 .07   0.14034     5         .52   1.22350     77
 .08   0.16051     5         .53   1.25672     83
 .09   0.18073     6         .54   1.29077     88
 .10   0.20101     5         .55   1.32570     93
 .11   0.22134     8         .56   1.36156    100
 .12   0.24175     7         .57   1.39842    107
 .13   0.26223     8         .58   1.43635    115
 .14   0.28279     9         .59   1.47543    123
 .15   0.30344    10         .60   1.51574    133
 .16   0.32419     9         .61   1.55738    142
 .17   0.34503    12         .62   1.60044    156
 .18   0.36599    12         .63   1.64506    166
 .19   0.38707    13         .64   1.69134    183
 .20   0.40828    13         .65   1.73945    197
 .21   0.42962    14         .66   1.78953    216
 .22   0.45110    15         .67   1.84177    236
 .23   0.47273    17         .68   1.89637    260
 .24   0.49453    16         .69   1.95357    285*
 .25   0.51649    18         .70   2.01363    315*
 .26   0.53863    20         .71   2.07685    351*
 .27   0.56097    19         .72   2.14359    391*
 .28   0.58350    22         .73   2.21425    437*
 .29   0.60625    22         .74   2.28930    493*
 .30   0.62922    23         .75   2.36930    558*
 .31   0.65242    25         .76   2.45490    633*
 .32   0.67587    26         .77   2.54686    728*
 .33   0.69958    27         .78   2.64613    837*
 .34   0.72356    29         .79   2.75382    972*
 .35   0.74783    31         .80   2.87129   1136*
 .36   0.77241    31         .81   3.00020   1341*
 .37   0.79730    34         .82   3.14262   1598*
 .38   0.82253    36         .83   3.30114   1918*
 .39   0.84812    37         .84   3.47901   2331*
 .40   0.87408    39         .85   3.68041   2860*
 .41   0.90043    42         .86   3.91072   3557*
 .42   0.92720    43         .87   4.17703   4480*
 .43   0.95440    47
 .44   0.98207    48
 .45   1.01022    52

* δ² has been modified from .69 to .87.


GRAPH 3. The circular normal distribution in aequiareal scale for k = 0, 1, 2, 4.
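Equation (2.5) can also be solved directly rather than by interpolation. The following Python sketch uses bisection on the ratio $I_1(k)/I_0(k)$; it is not the authors' method, the function names are hypothetical, and the power-series Bessel routine with its fixed number of terms restricts the search to k below 10, which covers the range of Table 2.

```python
import math

def bessel_i(order, k, terms=40):
    """I_order(k) from the ascending power series (moderate k only)."""
    half = 0.5 * k
    return sum(
        half ** (order + 2 * t) / (math.factorial(t) * math.factorial(order + t))
        for t in range(terms)
    )

def vector_strength(k):
    """Theoretical vector strength I_1(k) / I_0(k) for concentration k."""
    return bessel_i(1, k) / bessel_i(0, k)

def concentration(a_bar, k_max=10.0, tol=1e-10):
    """Solve I_1(k) - a_bar * I_0(k) = 0 for k by bisection on [0, k_max]."""
    if not 0.0 <= a_bar < vector_strength(k_max):
        raise ValueError("a_bar outside the range covered by this sketch")
    lo, hi = 0.0, k_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if vector_strength(mid) < a_bar:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for a_bar in (0.45, 0.50, 0.85):
    print(a_bar, round(concentration(a_bar), 5))
```

Bisection is chosen here only because the ratio increases steadily with k, so no derivative is needed; the printed values can be compared with the corresponding entries of Table 2.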


3. CALCULATION OF TABLES

The density function (2.1) is relatively simple to calculate from tables
of the Bessel function, the cosine, and the exponential function. The
density function is presented to three decimals in the form

(3.1)    $\sqrt{2\pi f(\alpha)} = \sqrt{e^{k\cos\alpha}/I_0(k)}$

which is adequate for tracing aequiareal frequency ovals on polar
paper (Table 3). This table was calculated to five decimals and checked
by differencing before cutting down to three decimals. It was further
checked by multiplying together the values corresponding to
supplementary angles. For tracing, the values given in Table 3 must be
multiplied by $\sqrt{n/m}$, where m stands for the number of wedges; in
our case m = 12.
Graph 3 traced in aequiareal polar scale shows the influence of the
parameter k. For increasing values of k the area from 90 to 270 degrees
shrinks to an insignificant part of the whole area. For the sake of
comparison the distributions are also drawn in linear scale in Graph 4.

GRAPH 4. The circular normal distribution in linear scale for k = 0, 1, 2, 4.

The areas of the circular normal distribution function, Table 4,
posed the main problem of calculation. Originally it was desired to
compute this table to 7 decimals for k = 0.1 (0.1) 4.0 and
$\alpha$ = 2½° (2½°) 180°. This was actually done, and the present table
is an abridgment. For calculation three methods were considered:
quadratures, expansion in a series of incomplete beta functions, and
expansion in a series of sines. The sine expansion was finally accepted,
partly because it was suitable for use with an I.B.M. 602 calculating
punch available to the authors, and partly because adequate tables of
the Bessel functions $I_n(k)$ were obtainable.

The method of quadratures was suggested in 1950 by Dr. H. R. J.
Grosch, then of the Watson Scientific Computing Laboratory, Columbia
University. The method has the advantage that the complete integral,
$\int_0^{\pi} e^{k\cos x}\,dx$, is obtained without reference to its
definition as a Bessel function, and the Bessel function is then used as
a check. This method would be suitable for longhand calculation, and it
seemed ideally suited to calculation on the card programmed calculator
at the Watson Laboratory, a possibility briefly considered. The method
was not, however, well adapted to calculation on the 602 calculating
punch.

The method of incomplete beta functions derives from Mises [11, p.
499], who gives the formula

(3.2)    $\int_0^{x} e^{k\cos z}\,dz = \sum_{n=0}^{\infty}\frac{k^n}{n!}\int_0^{x}\cos^n z\,dz$

and indicates that it was used for computation by S. Weieh. Arnold has
likewise stated, in conversation with the authors, that he used a
formula equivalent hereto in computing his 6 decimal table [1]. The
integral on the right side of (3.2) is recognized as the incomplete beta
function

$$\frac{1}{2}\int_0^{v} t^{-1/2}(1 - t)^{(n-1)/2}\,dt, \quad\text{where } v = \sin^2 x.$$

To evaluate it there are available, besides expansion of the integrand
as a polynomial in cos mz:

(1) The tables of Legendre [7, p. 178], available only for n = 2, 4, 6.

(2) The table of Soper [12, p. 179], for $x = \pi/2$. This was used in
preliminary calculations.

(3) The table of Wishart [12, p. 239]. This was investigated but its
auxiliary argument, which we shall call $\xi$, defined by

$$\xi = 2n^{1/2}\tan\frac{x}{2},$$

renders it unsuitable for extended calculations for fixed x; moreover it
fails for n < 9.

(4) The extensive beta-function table of Pearson [13]. This requires
the auxiliary variable $\cos^2 x$, rendering interpolation everywhere
necessary and requiring, for moderately small x, some such artifice as
that of Pearson and Stoessiger [12, p. 176].

The method of sine series rests on the expansion

(3.3)    $e^{k\cos x} = I_0(k) + 2\sum_{n=1}^{\infty} I_n(k)\cos nx$

which was essentially given by Frullani [4, p. 502]. Integration term by
term gives

(3.4)    $\frac{\int_0^{\alpha} e^{k\cos x}\,dx}{\pi I_0(k)} = \frac{\alpha}{\pi} + \frac{2}{\pi I_0(k)}\sum_{n=1}^{\infty}\frac{I_n(k)}{n}\sin n\alpha.$
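The sine series can be tried out against a direct quadrature of the same integral. The Python sketch below is illustrative only: the function names are hypothetical, the Bessel functions are summed from their ascending power series, the series is truncated at order 14, and the quadrature is a plain midpoint rule rather than any method described in the text.

```python
import math

def bessel_i(order, k, terms=40):
    """I_order(k) from the ascending power series (moderate k only)."""
    half = 0.5 * k
    return sum(
        half ** (order + 2 * t) / (math.factorial(t) * math.factorial(order + t))
        for t in range(terms)
    )

def area_by_sine_series(alpha, k, max_order=14):
    """Area from 0 to alpha (radians), scaled so that the area to pi is 1."""
    i0 = bessel_i(0, k)
    total = alpha / math.pi
    for n in range(1, max_order + 1):
        total += 2.0 * bessel_i(n, k) * math.sin(n * alpha) / (n * math.pi * i0)
    return total

def area_by_quadrature(alpha, k, steps=20000):
    """The same area by the midpoint rule applied to exp(k cos x)."""
    h = alpha / steps
    integral = h * sum(math.exp(k * math.cos((j + 0.5) * h)) for j in range(steps))
    return integral / (math.pi * bessel_i(0, k))

alpha = math.radians(60.0)
print(area_by_sine_series(alpha, 2.0), area_by_quadrature(alpha, 2.0))
```

With k no larger than 4 the terms beyond order 14 are negligible at the accuracy of the quadrature used here, so the two results should agree closely.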


When the tabulation of the distribution by the above formula was first
considered, the then most readily available tables of $I_n(k)$ were those
of A. Lodge [5, p. 309]. The interpolation of these for intervals of 0.1
and the extrapolation by recurrence to $I_{14}$, which preliminary
calculation showed would be necessary for 9 decimal accuracy throughout,
were not works we were prepared to undertake. Then Dr. J. C. P.
Miller generously provided us with proof sheets of the Royal Society's
forthcoming volume of Bessel functions of integral order, proceeding at
intervals of 0.1 and going up to Iis. 


The connection between the two series expansions should be briefly
noted. We have

(3.5)    $I_0(k) + 2\sum_{n=1}^{\infty} I_n(k)\cos nx = \sum_{n=0}^{\infty}\epsilon_n\cos nx\sum_{t=0}^{\infty}\frac{k^{n+2t}}{2^{n+2t}\,t!\,(n+t)!} \qquad (\epsilon_0 = 1;\ \epsilon_n = 2,\ n > 0)$

         $= \sum_{m=0}^{\infty}\frac{k^m}{2^m}\sum_{t=0}^{m}\frac{\cos (m - 2t)x}{t!\,(m - t)!}.$

Hence, if we take the beta function series and neglect terms beyond that
in $k^m$, it is equivalent to replacing each Bessel function by its
ascending power series and truncating these at the terms in $k^m$. The
double series converge absolutely, and so we have incidentally proved
that the coefficients of the cosine series are the $I_n$; this was not
the method used by Frullani, who, writing before the modified Bessel
functions had been defined, obtained the differential equation satisfied
by the coefficient of cos nx by integrating by parts and thence derived
the power series.

The use of punch card equipment to calculate series such as (3.4)
is now fairly widespread. According to Comrie [2], “The first scientific
application of the Hollerith, in 1928, was to the summation of
harmonic terms, or, in other words, Fourier synthesis.” Comrie's
description of this job [3] is still worth reading; the reader should note
carefully the difference between English and American machines [2,
pp. 156-157] and remember that the 80-column card is now nearly
universal.

In our calculations, values of $2/\pi I_0(k)$ were determined by hand
calculator to 10 decimals and punched into cards for k = [0.1 (0.1) 4.2].
$I_n$ and n were punched; then $2I_n(k)/\pi I_0(k)$ and
$2I_n(k)/n\pi I_0(k)$ were formed successively on the 602 calculating
punch. The angles nx, $1 \le n \le 14$, x = [0° (2.5°) 180°], were
reduced by hand to the first quadrant and the sines were punched to 9
decimals from the National Bureau of Standards table [19].

The term $\alpha/\pi$, being a multiple of 1/72, was calculated on an
I.B.M. 405 tabulator by progressive summation. The sines were multiplied
by the Bessel function quotients on the 602. Although the summations
could have been made on this machine also, they were made on a
tabulator (both 405 and 402 were used) in order to have a printed
record of the terms; the process differs from that in [2] principally in
the fact that the tabulators used in our calculations are equipped to
subtract from the card. A 513 summary punch was connected to
punch a 9 decimal table of the function (with errors up to 7 units in
the last place); this table was printed and examined for gross errors.
It was then checked by forming first to sixth differences, inclusive, in
both the k- and $\alpha$-directions, on the 602. From the 9 decimal
second and fourth differences, in each direction, modified second
differences were computed by the usual formula

$$\delta_m^2 = \delta^2 - .18393\,\delta^4.$$

The function and the modified differences in the two directions were
then rounded to 7 decimals by adding 5 to the eighth decimal. The
modification and the rounding were checked by summing the cards in
groups on the tabulator, and the table was then printed.
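The formation of a modified second difference can be shown on a small scale. The Python sketch below is an illustration with hypothetical function names; it takes the usual throwback coefficient as .18393 and, on the assumption that the starred entries of Table 2 were formed in the same way, applies it to the five tabulated values of k for vector strengths .73 to .77, expressed in units of the fifth decimal.

```python
def central_differences(values, i):
    """Second and fourth central differences of equally spaced values at index i."""
    d2 = values[i + 1] - 2 * values[i] + values[i - 1]
    d4 = (values[i + 2] - 4 * values[i + 1] + 6 * values[i]
          - 4 * values[i - 1] + values[i - 2])
    return d2, d4

def modified_second_difference(values, i, factor=0.18393):
    """Second difference with the fourth difference thrown back into it."""
    d2, d4 = central_differences(values, i)
    return d2 - factor * d4

# k for vector strengths .73, .74, .75, .76, .77, in units of the fifth decimal.
k_units = [221425, 228930, 236930, 245490, 254686]
print(central_differences(k_units, 2))
print(round(modified_second_difference(k_units, 2)))
```

The rounded result can be compared with the starred entry opposite .75 in Table 2.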

The present abridgment, Table 4, was prepared by taking cards
corresponding to increments of 0.2 in k and 5 degrees in $\alpha$ and
differencing them five times in each direction; the fifth differences
were used only to control gross errors in the fourth. The modification,
rounding, checking, and printing then proceeded as with the 7 decimal
table, the 5 being added, of course, in the sixth place.


4. SUMMARY


The mean direction $\alpha_0$ and the vector strength $\bar{a}$ of
circular data as used by climatologists are calculated from (1.2') and
(1.3). The same statistics may be used for data other than weather. With
data grouped in monthly class intervals the calculations of these
statistics are facilitated by (1.4) and (1.5). The assumption that the
maximum likelihood estimate of a “true value” shall be given by the mean
direction leads to the circular normal distribution (2.1). The central
location parameter of this distribution is estimated by the mean
direction $\alpha_0$. The parameter k is estimated from $\bar{a}$ by
means of Table 2. The radii for plotting the distribution in aequiareal
polar scale are obtained by multiplying the values given in Table 3 by
$\sqrt{n/12}$, where n stands for the total frequency. Areas of the
distribution are given in Table 4. Numerical examples will be given in a
forthcoming article.


BOOK REVIEWS


Mathematics Essential for Elementary Statistics. Revised Edition. Helen M.
Walker (Professor of Education, Teachers College, Columbia University). New
York: Henry Holt and Company, 1951. Pp. xiii, 382. $2.75.


Leo A. Goodman, University of Chicago


THIS book, a self-teaching manual for the adult who has forgotten or
never grasped elementary mathematics, still remains without serious
competitors.

The first edition (1934) of this book was reviewed by Herbert Toops in
this Journal, Volume 30 (1935), pp. 483-84. This revised edition contains
new sections dealing with simple operations with common and decimal
fractions, inequalities, linear interpolation, permutations and combinations,
the binomial expansion, the multinomial expansion, ratios and proportions,
trigonometric identities, second degree equations, degrees of freedom, and
weights and measures. Many of the explanations appearing in the 1934
edition have been expanded. In order to keep the book small (the first
edition contained 246 pages) all those sections of the first version which
related specifically to the content of statistical method have been
eliminated.

Every topic discussed is accompanied by self-scoring exercises, answers
for which are provided in a key at the back of the book. Some of these answers
(e.g., p. 9 problem 14, p. 34 problem 11 (2), p. 46 problems 4 and 5, and
others) are incorrect, but this should not confuse the student who has
mastered the topic.


Elements of Statistics. C. G. Lambe. London: Longmans, Green and Company, 
1952. Pp. viii, 112. 


Carl F. Kossack, Purdue University


THE author sets forth the object of this text as follows: “This work is
written primarily for the use of engineering students, but it is hoped that
it will be a useful introduction to the Theory of Statistics for scientists
and experimentalists of all kinds.” Such a goal is a very ambitious one and
one which has interested the reviewer for some time. It embraces the difficult
question of whether it is necessary to have separate introductory courses in
statistics for each major area or if a single first course will suffice. Since this
text also includes a section on Quality Control, one finds himself faced with
the further problem of whether this phase of statistics can be combined with
other phases or needs to be treated in an independent course. Unfortunately,
the text being reviewed does not help much if one is interested in answering
such questions.

In fact, the text is so brief, consisting of only 104 small pages, that one
feels that many of its weaknesses can be attributed to this limitation in
space. The material covered falls into three major categories: elementary
statistical methods, the theory of errors, and quality control. The approach
lacks inspiration. It follows the usual outline: frequency distributions,
measures of dispersion, continuous distributions, etc., each chapter being
very restricted in its nature, consisting of around ten pages including the
exercises. The large number of exercises associated with each chapter is to
be commended, but one cannot help but feel that they have been introduced
either as arithmetic projects or for mathematical gymnastics rather than to
develop the student's appreciation of statistics.

The chapter on the theory of errors is the most worthwhile one in the book.
The material in this chapter deserves being considered for inclusion in more
courses in statistics than it is today. It is apparent that the eight pages used
by the author to discuss quality control are not enough to do the subject
justice.

One could mention particular difficulties encountered in reading certain
sections. Even the first sentence in the book gives rise to some question,
“Statistics may be defined as the scientific treatment of numerical data.” It is
apparent that the statistician will soon inherit the earth. Not enough
attention is given to explaining the real purpose behind the many statistical
ideas and measures that are introduced. For example, the concept of
independence of variables is never really explained, let alone defined,
although it is used.

An improvement in the text could be achieved if more space and attention
were given to the sections on the theory of errors and quality control, with
the other material restricted to an introductory section stressing more the
basic concepts of statistics, especially those needed to appreciate the work
covered in these two areas.


Introducción a los métodos de la estadística. 1ª Parte. Sixto Ríos. Madrid: 1952.
Pp. 205. Paper.


Paul R. Halmos, University of Chicago


THIS is a charming and elementary book that fulfills, within the limits the
author sets for it, the promise of the title. The mathematical level of the
book is not advanced; nothing more high-powered than integral calculus is
ever used. The topics discussed are: statistical tables, graphical
representation, frequency and probability, sampling, random variables,
multidimensional statistics, correlation, regression, sampling distributions,
and tests of hypotheses. Although at times the definitions are a little vague
(cf. the definitions of statistical variable and random variable on pp. 35 and
36), the large number of examples, complete with detailed tables and graphs,
is likely to give the reader a sound intuitive grasp of the subject. A notable
feature of the book (emphasized by Herman Wold in his preface) is the number,
variety, and interest of the exercises. The reader's statistical imagination
will probably be stimulated when he is asked whether meat consumption in
Madrid can be adequately studied by using samples of telephone subscribers
(p. 8), and when he is requested to perform various intricate coin-tossing
experiments that illustrate the possible presence of correlation in
multidimensional situations (p. 97). As a pre-technical introduction, with the
purpose of telling the experimenter what ideas he is likely to encounter in
his study of statistics and what statistics is likely to do for him, the book is
highly recommended.


Statistisk Tidskrift. Nr 1, 1952. Stockholm: Statistiska Centralbyrån, Januari
1952. Pp. 83. Paper.

Statistisk Tidskrift, published by the Central Bureau of Statistics of Sweden,
is a new journal whose main purpose is to give summaries of current sta-
tistics at an earlier date than official statistical reports usually do. It will
also contain articles on new statistical techniques, international statistical
collaboration, reviews of books in statistics, and so on. 
Among the articles in this first issue the following two are of special inter-
est to statisticians in general: one by H. Cramér on statistical methodology
and the other by T. Dalenius on operations analysis.
U. G.


A Guide to Tables of the Normal Probability Integral, National Bureau of Stand-
ards Applied Mathematics Series 21, Washington, U.S. Government Printing
Office, 1952, 15 cents. Pp. iv, 16.

“This Guide,” its Introduction says, “makes available in one place certain
information that will be helpful in utilizing the normal probability tables
given in standard statistical texts and other important sources. . . . The aim 
has not been to present an exhaustive or historical compilation but rather to 
include the more important sources likely to be accessible in libraries of 
American universities, research organizations, etc., as well as some of the 
more recent publications that educational institutions, research organiza-
tions, and students of statistics might be encouraged to acquire.” There are
62 entries in the bibliography.

The tables are classified into two main categories, direct and inverse,
according to whether the probability is shown as a function of the unit 
normal deviate, or the deviate as a function of the probability. Each category 
is divided into five groups, according to the kind of probability tabulated: 
central, semi-central, two-tail, single-tail, or cumulative. “Parts I and II tell 
which sources give tables for the indicated types of segments of area under 
the normal curve and give the essential characteristics of the tables. These 
features are succeeded by a description of interpolation methods (Part III)
especially applicable to certain of the tables mentioned. Then follow two
brief comments [i.e., two long sentences] . . . giving the relationship be-
tween some types of normal tables” and the incomplete gamma and chi-
square tables.
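
The relation among the five kinds of tabulated probability, and between direct and inverse tables, can be sketched in a few lines of Python. This is only an illustration: the reading of each name given in the comments (for example, "semi-central" as the area from zero to the deviate) is an assumption of the conventional meaning and is not quoted from the Guide, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def direct_areas(x):
    """Direct tabulation: areas under the unit normal curve for a deviate x >= 0.

    The interpretation of each name below is an assumption, not the Guide's wording.
    """
    central = math.erf(x / math.sqrt(2.0))      # area between -x and +x
    return {
        "central": central,
        "semi_central": central / 2.0,          # area between 0 and +x
        "two_tail": 1.0 - central,              # area beyond -x and +x together
        "single_tail": (1.0 - central) / 2.0,   # area beyond +x only
        "cumulative": 0.5 + central / 2.0,      # area from minus infinity to +x
    }


def inverse_from_cumulative(p):
    """Inverse tabulation: the deviate whose cumulative probability is p."""
    return NormalDist().inv_cdf(p)


areas = direct_areas(1.96)
deviate = inverse_from_cumulative(areas["cumulative"])
```

Because all five quantities are functions of the central area alone, a table of any one kind can be converted to any other, which is presumably why a guide organized by kind of probability is useful to a reader holding only one text.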

M. A. W.


Consumers’ Price Index. Report of a Special Subcommittee of the Committee
on Education and Labor Pursuant to H. Res. 73, Subcommittee Report No. 2,
House of Representatives, 82nd Congress, 1st Session, 1951, 39 pp.


Jules Backman, New York University


Early in 1951, the United Electrical, Radio, and Machine Workers of
America published a study containing extensive criticisms of the Con-
sumers’ Price Index. This study followed the pattern of the Meany-Thomas
Report which was used to denounce the Bureau of Labor Statistics and the
Consumers’ Price Index during World War II. To evaluate these charges
and to determine the present adequacy of the Consumers’ Price Index, a
special Subcommittee of the House Committee on Education and Labor 
was established. After hearing a number of witnesses, this Committee pre- 
pared the report under review. 

This report is an excellent brief survey of how the Consumers’ Price Index 
is constructed, its use, and its history. The Committee found that the UE 
charges were without merit and that the Consumers' Price Index “is an 
excellent index.” The Committee made several recommendations which ap-
pear to be solidly based. Thus, for example, they recommended that the
BLS make periodic studies of “changes in living costs due to changes in
non-price factors” and “the extent of changes in prices paid by wage earners
and lower salaried clerical workers who moved from one community to
another.” Such studies would be useful particularly during periods char-
acterized by major upheavals in the economy such as those which attend
an armament program. The Committee properly suggested that the Con-
sumers’ Price Index should not incorporate these types of changes.

The Committee also reviewed the suggestion that income taxes should be 
included in the index. It recommended a continuation of the present practice
under which the BLS includes excise and sales taxes, but excludes income
taxes. All citizens have a responsibility to pay their fair share of the cost
of government. If income taxes were included in the Consumers’ Price
Index, the net result would be to exempt from such responsibility all workers
who are covered by escalator clauses or who received wage increases based
upon changes in the Consumers’ Price Index. The adoption of a general
sales tax, however, would create serious problems in this connection and
would require a reexamination of the BLS method of handling such taxes.

The Committee also recommended that BLS should follow “a policy of 
keeping the index continuously under review, with revisions as required 
when important changes in buying habits occur, rather than infrequent 
complete revisions as has been the practice in the past.” If such a continuous 
review were made, there would be little need for the major revisions of the 
index such as the Bureau is now engaged in making. The Committee properly 
suggests that BLS “should be extremely cautious against making too fre-
quent changes of the index which are minor in character.”

The Committee has recognized that to effectuate the changes it suggests
would require adequate financial support for the Bureau. This in itself is a
contribution since very often such committees make recommendations
without being willing to face the fact that their execution would cost money.
It is very healthy to have periodic reviews of an index as important as the
CPI. The Steed Committee is to be congratulated for a very workmanlike
job.


A Dollar Index of Soviet Machinery Output, 1927-28 to 1937. Alexander Ger-
schenkron, assisted by Alexander Erlich. Santa Monica: The Rand Corporation,
April 6, 1951. Pp. v, 352. Paper. Not published for sale; a limited number of
copies are available on request from interested scholars.


John Crawford, United Nations


This study is mainly devoted to the construction and appraisal of a new
index based on the physical quantities of 128 items of machinery pro-
duced in Russia between 1927-28 and 1937 weighted with 1939 U. S. factory
prices. This unusual index is described by the author as “part of a compre-
hensive recomputation of the rate of growth of industrial output in Russia
. . . justified by the manifold deficiencies of the official Soviet index of
industrial production.”

Briefly, it is charged that the official Soviet indexes have grossly over-
stated the real rate of growth of industrial output. The machinery index is
regarded as exemplifying this tendency since it is felt to reflect in exaggerated
form the main weakness of the Soviet indexes, namely, the introduction of
new commodities at approximately their current prices during the period of
general inflation subsequent to 1926-27, the weight year of the indexes.
This results in an upward bias because the output of the new items expanded
faster than the old. Thus an examination of the index computed by Gerschen-
kron reveals that 24 items were first included in 1931 and 21 in 1932. The
value of these items in 1937 (at 1939 U. S. prices) was 3.5 to 4 times their
value in the year they were first included, whereas the total index only
doubled during the same period.

As evidence of the charge that the “1926-27 prices” of machinery are
inflated (presumably in the sense that they contain prices of later years for 
commodities not produced in 1926-27), the author refers to data presented 
in the Soviet Plan for 1941. This source contains values of planned output 
for 1941 at current and at 1926-27 prices for a number of different Ministries.
These indicate that the 1941 planned output at current prices for the elec- 
trical and other machinery industries exceeds that in 1926-27 prices by only 
10 per cent, whereas 1941 values for the ferrous metal, coal, and woodwork- 
ing industries are more than double the values in 1926-27 prices. It is con- 
cluded that the similarity of values at 1941 and 1926-27 prices for the ma- 
chinery industries (in which the increase in physical production is known to 
have been large) is explained by the fact that so-called 1926-27 prices con- 
tain prices of subsequent years for commodities not produced in 1926-27. 
The author rejects as a sufficient explanation cost reductions in the machin- 
ery industry (said to have amounted to 45 percent over the period of the 
Second Five-year Plan, 1933-1937) on the ground that this reduction was 
insufficient as well as on the basis of statements of Soviet writers.

In addition to the effect of the general price inflation, the weights are
believed to be inflated also because the prices used for the new commodities
are said to be those which prevailed in the initial period of mass production
before the usual increases in productivity took place.

Aside from the problem of new commodities, the use of 1926-27 prices 
as weights is considered to have imparted an upward bias to the index be-
cause increases in output and productivity are positively correlated. Thus
the use of beginning-period prices as weights would result in an index showing
a larger rate of growth than the use of end-of-period prices. This proposition
is tested with U. S. price and quantity data for machinery items common
to the years 1899 and 1909, 1899 and 1923, 1909 and 1923, and 1909 and
1939: indexes computed with beginning-period prices as weights show from
two to seven times as large an increase as when end-of-period prices are
used. The longer the period of comparison the greater is the relative increase. 
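
The direction of this bias can be seen in a small numerical sketch. The Python below is an illustration only: the two-item economy and every price and quantity in it are invented for the purpose and come neither from Gerschenkron's study nor from the U. S. data he used, and the function name is hypothetical. It assumes, as the argument above does, that the item whose output grows fastest is also the one whose relative price falls most.

```python
def quantity_index(q_begin, q_end, prices):
    """Ratio of end-period to beginning-period output, both valued at one fixed set of prices."""
    value_end = sum(p * q for p, q in zip(prices, q_end))
    value_begin = sum(p * q for p, q in zip(prices, q_begin))
    return value_end / value_begin


# Hypothetical figures: item 0 is an established product, item 1 a new one.
q_begin = [100.0, 10.0]
q_end = [120.0, 100.0]    # the new item's output grows tenfold
p_begin = [1.0, 10.0]     # the new item is dear while production is still small
p_end = [1.0, 2.0]        # its price falls as productivity rises

begin_weighted = quantity_index(q_begin, q_end, p_begin)  # beginning-period price weights
end_weighted = quantity_index(q_begin, q_end, p_end)      # end-of-period price weights
```

With beginning-period weights the fast-growing item carries its early, high price and dominates the total; with end-of-period weights it carries its later, low price. The same physical quantities therefore yield a markedly higher index under the first weighting, which is the effect the author tests.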

To get around the problem caused by the “process of changing scarcity 
relationships” involved in the economic development of a country, and 
lacking a consistent set of Soviet prices for machinery, the author used 1939 
U. S. price relationships. His choice of U. S. prices is defended on the grounds
of their availability and the fact that their use permits a comparison of the
absolute levels of U. S. and Soviet machinery production. The prices used
were drawn from Census of Manufactures data and from American business-
men, who having dealt with Russia were able to identify and price American
counterparts of Soviet machinery items. Use of the Census limited the period
chosen to census years and from these 1939 was selected as being the most
recent available for the purpose, data for 1947 being regarded as distorted
by the war. According to the author, “The problem of using as index weights
prices of a different country is essentially the same as that of basing an index
of a country’s output over a given period of industrialization alternatively
on prices pertaining to the beginning and to the end of a period.” It would
seem, then, that if the Russian machinery industry during the 1930’s is
regarded as having been at the same stage of economic development as, for
instance, that in the U. S. just prior to the first World War, the use of 1914
U. S. prices rather than 1939 would have more nearly approximated Soviet
scarcity relationships.

The fact that in 1937 the Soviet index exceeded 3.2 times the index com-
puted by Gerschenkron (1927-28 = 100) is explained by (1) an unknown
amount of downward bias in the latter measure; (2) pricing errors of the
Soviet index; and most important of all, (3) the effects of using as weights
for the computed index prices of a year (1939) that approximate the end of
the period (1937) and are prices of a more highly developed economy. As the
author states, “The use of American prices of the year 1939 is an altogether
sui generis way of looking at Soviet Output. The index of Computed Output,
therefore, is not a gauge by which the Soviet Indexes can be adjusted.”

Accordingly, the charge of bias in the Russian index still seems to rest
on the validity of these propositions: that there was a substantial degree of
inflation in Soviet machinery prices after 1927-28, and that in a period of
economic development the use of base period weights leads to a higher index
than weights based on end-of-period prices.

Whether the reader will agree with the author's conclusion that “with all
its deficiencies the [Gerschenkron] index should still provide a possible basis 
for gauging the development of a crucial branch of Russia’s industrial 
economy” will depend on the meaning attached to measures of industrial 
production. Aside from the pricing deficiencies of the Soviet index the ques- 
tion of the choice of beginning- or end-of-period weights must rest on the 
use of the indexes. As to the use of U. S. prices as weights, it would appear 
that the meaning of a production measure is anchored in its economic 
setting so that whatever phenomenon the new index can be said to measure, 
it is not the physical volume of Russian machinery output.

Students of production indexes will be indebted to Professor Gerschenkron 
not only for the general clarity of his description but also for the 288 pages 
of appendices which set forth the basic data and computations underlying 
the study. May the Russians take heed.


Business Fluctuations. Robert A. Gordon. New York: Harper and Brothers, 
1952. Pp. xvi, 624. $5.00. 


Henry H. Villard, The City College, New York


Professor Gordon has written an able and unusual volume on business
fluctuations, placing main emphasis on actual cycles and minimizing the
space devoted to cycle theory. The material is divided into three parts:
the first, covering background understanding and tools of analysis; the sec-
ond, the nature and causes of fluctuations; and the third, prediction and
control.

Part I starts with a discussion of the income and transaction versions of 
the quantity equation, and of national income, gross national product, and 
related concepts. Next the determinants of aggregate demand are considered. 
The relationship between consumption and investment is examined at 
length, both in theoretical, ex ante terms and in the light of statistical, ex post 
evidence, and the theory of the multiplier is analyzed. The first part ends 
with a discussion of fluctuations in investment and the role of the accelera-
tion principle. The conclusion is that the acceleration principle—either as
an explanation of investment fluctuations or in conjunction with the multi- 
plier as an explanation of entire cycles—is a weak reed without noticeable 
empirical support, and that the Keynesian determinants via liquidity pref- 
erence and the supply of money are of no more practical use. The detailed 
coo discussion is therefore necessary if fluctuations are to be understood.

The second part, on the nature and causes of fluctuations, constitutes the 
core of the volume. It starts with a presentation of the meaning and available 
measures of business activity, including aggregative measures such as the 
GNP, bank debits, and employment and index numbers such as those of 
the Federal Reserve and the American Telephone Company. Next the char- 
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acteristics of private-enterprise economies making for secular and cyclical 
change are set forth, with population growth, technological developments, 
alterations in tastes, “structural changes,” and wars placed in the former
category and production for profit, price-cost relationships, and the use of
unstable money in the latter. Stress is laid on the inevitable tendency 
toward fluctuations under these conditions, which the much emphasized 
relationship between saving and investment summarizes but does not ex-
plain. The kinds of economic change which result are set forth, including
seasonal, secular, cyclical, and irregular movements. The nature and past 
behavior of secular movements is covered, together with Kondratieff long 
waves (which are considered to be unconfirmed) and building cycles (which 
are considered significant in connection with business cycles of unusual
severity). This clears the way for a presentation of how the economy behaves 
during business cycles, which relies heavily upon the studies of the National 
Bureau of Economic Research (whose method of measuring the cycle is 
described at this point), and leads in turn to a discussion of how the economy 
generates business cycles. The most original part of the presentation is a
welcome distinction between minor and major depressions: the former are 
likely to result from over-expansion of inventories or overproduction in 
particular areas and be ended by liquidation of inventories or absorption of 
the specific excess capacity; the latter are likely to result from impairment 
of long-term investment opportunities and be ended by accumulated needs
to replace worn-out capital and renewed expansion of growth industries,
with building activity sometimes playing an important initiating role.

Two of the last four chapters of Part II are devoted to business cycle 
theories and two to a description of American economic developments from 
1919 to the Korean War. The theorists discussed are grouped as “business- 
economy” theorists (Mitchell, Veblen, Pigou, and Metzler), monetary 
theorists (Hawtrey, Wicksell, and Warburton), a “capital-shortage” group 
(Hayek, Spiethoff, and Cassel), a “partial-overinvestment” group empha- 
sizing the exhaustion of specific investment opportunities (Schumpeter, 
Hansen, and Robertson), and a group stressing overinvestment in rela-
tion to final demand (Clark, Harrod, Kalecki, and Hicks). Underconsump-
tionists and weather theorists are mentioned briefly, while an appendix
is devoted to econometric model-builders. The final two chapters describing
actual fluctuations in detail are in many ways the most original and inter-
esting in the book. That they are urgently needed is obvious, now that so
many students have no recollection of anything except the prosperity that 
has prevailed since World War II. The presentation contains many sugges- 
tions as to the factors responsible for the fluctuations that occurred without 
sacrificing balance in the description of the events themselves. 

The final section of the volume is devoted to prediction and control. Two 
broad methods of forecasting business activity are distinguished: historical 
pd and “cross-section analysis.” In the first group only the statistical 
indicators of the National Bureau are found to be promising, but various 
methods of projecting the GNP and its components (and relating them to
an industry and ultimately to a firm) are found to be worthwhile. Inter-
national aspects of instability are also discussed in this section, including
the way in which fluctuations are transmitted and the measures needed to
reconcile internal and external stability. But the bulk of the section is
devoted to the achievement of domestic stability, including not only full 
employment but also the elimination of significant price changes. The extent 
to which these objectives can be realized by controlling consumption and 
private investment, by using monetary and fiscal policies, and by influencing 
wages and prices is discussed at length. A final chapter summarizes the
extent to which governments in the United States and elsewhere are today 
formally committed to achieve full employment and the actual preparations 
that have been made. In attaining stability major importance is attached 
to fiscal policy, but the broad setting in which such policy must operate and 
the way in which other policies can be used to help are also made clear.
Throughout, the emphasis on how to achieve stability is unusual and will
be welcome to those whose students insist on knowing what can be done
about economic fluctuations.

Professor Gordon's volume is a solid and substantial achievement. If I 
have any reservation, it is that the book is a bit too solid for its original 
objective of providing a text for an upper-division course required of all 
business administration majors at the University of California. Almost all
texts dealing with business cycle theories have felt an obligation to include
theories which have been discredited professionally; Professor Gordon seems
to have felt a similar obligation to include almost everything that business
cycle economists have been doing in recent years, whether or not the results 
have been particularly fruitful. The result is a book with quite unequalled 
references to the periodical literature which should be of unique value to the 
graduate students just starting in or to professional economists who have 
not kept up with recent developments. But I believe—though it is clearly a 
matter of judgment—that the volume would have been of greater value to
undergraduates if perhaps a third or so of the material had been omitted
and more emphasis placed on crucial areas of analysis and policy.

To illustrate: most of the tools considered in the first section are never 
subsequently used to justify the space (and student energy) devoted to their 
development. Again in Part II I believe that the 140 pages leading up to 
how cycles are generated might well have been condensed in the interest of 
an elaboration of the forces involved, which are presented rather sketchily. 
I feel, for example, that too little stress is laid on the increasing postpona- 
bility (with little sacrifice of current satisfaction or production) of expendi- 
tures on consumer and producer durables, investment, and replacement at 
the end of a period of prosperity, and on the increasing inability to postpone 
expenditures (without immediate repercussions on satisfaction or produc- 
tion) after an extended depression. The discussion of fiscal versus monetary 
policy in Part III also seems to me to reflect an unawareness—admittedly
one widely prevalent at the present time—of the fact that fiscal policy is
generally subject to the same limitations as monetary policy in checking
an inflation when a commitment to stabilize government bonds at any
particular level has been accepted—except perhaps to the extent that a
budget surplus increases saving and so permits a lower level of interest rates
to be maintained without inflation. Certainly, holding of the proceeds of a
surplus in idle Treasury deposits, which Gordon recognizes is usually neces-
sary if the surplus is to be fully deflationary during a period of prosperity,
will in most cases have the same repercussions on the government bond 
market as a similar reduction of deposits in the hands of the public through 
the use of monetary policy. Finally, I wish that Gordon had discussed price 
stability somewhat more realistically. Despite a preference for constant 
prices in the long run (p. 487) and a suggestion that stabilization may re-
quire “some sort of supervision” over wage and price policies (p. 498), the
subsequent detailed discussion (pp. 551-558) offers little except education
of labor leaders and the possibility of letting unemployment rise as high as
6 or 7 per cent. Actually constant prices mean that average wages, including
fringe benefits, cannot increase as much as a nickel an hour a year.¹ Given
such facts as the steady increase in wages from 1933 to 1937, when unemploy- 
ment was far higher than 7 per cent, and Mr. Lewis’s recent characteri- 
zation of an effort to cut his miners back to a wage increase of rather more 
than 10 per cent as exhibiting a sadistic “penchant for robbing miners’ 
babies of life-giving milk,” I feel that more discussion of just how labor 
leaders are to be educated would be of educational value to all of us! 
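
The arithmetic behind the nickel-an-hour figure can be sketched briefly. The Python below rests on the reviewer's stated assumption of a 2 per cent annual increase in productivity per man hour; the average hourly wage of $1.65 is an illustrative assumption introduced here and does not appear in the review, and the function name is hypothetical.

```python
def max_noninflationary_raise(hourly_wage, productivity_growth):
    """Largest hourly raise that leaves labor cost per unit of output unchanged."""
    return hourly_wage * productivity_growth


PRODUCTIVITY_GROWTH = 0.02   # the reviewer's assumption: 2 per cent a year
ASSUMED_HOURLY_WAGE = 1.65   # dollars; illustrative only, not taken from the review

raise_per_hour = max_noninflationary_raise(ASSUMED_HOURLY_WAGE, PRODUCTIVITY_GROWTH)

# Hourly wage at which a 2 per cent raise would just equal a nickel.
break_even_wage = 0.05 / PRODUCTIVITY_GROWTH
```

On these figures the permissible raise is a little over three cents an hour, and the nickel would be reached only at an average wage of $2.50 an hour, so the reviewer's claim holds for any average wage below that level.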

But these are largely matters of preference and emphasis. The sort of 
book I would like for undergraduates doesn’t exist, and, the proof of the
pudding being in the eating, I shall be using Gordon’s volume in my under-
graduate business cycles course next semester.


Private and Public Investment in Canada, 1926-1951. O. J. Firestone. Ottawa:
Department of Trade and Commerce, 1951. 4°. Pp. 254. Paper.


Millard Hastay, National Bureau of Economic Research


There should be something of value in this complex and formidable
monograph for every serious student of investment.


In origin, it is a source book of Canadian fixed capital expenditures in the
period 1926-51; and this aspect of the work finds expression in Part II,
which consists of 64 quarto-size pages of tabular material on Canadian in-
vestment flows, and Part III, which comprises 38 pages of explanatory
material on concepts and definitions, sources and estimating techniques, and
the quality of the estimates. The impressive thing about this side of the
work is its scale. The estimates cover all outlays on durable productive assets
by every sector of the economy. This means not only gross investment, but
also repairs and maintenance, and not only by the business community, but
also by such institutions as churches, schools, universities and hospitals, by
governments both on durable physical assets and on development and con-
servation of natural resources, and by individuals on housing. The figures
are shown separately for the private and public sectors, and also for in-
dustrial groupings based on the Canadian Standard Industrial Classification,
defined to include publicly as well as privately owned establishments.
Finally, both gross investment and repairs and maintenance are divided
into outlays for construction and outlays for machinery and equipment.
Most of these distinctions can, in turn, be crossed; and several of them, such
as “the public sector” or industrial groupings such as “manufacturing” and
“utilities,” are themselves multiple classifications. The end product is a set
of 123 tables—each containing, on the average, four non-duplicating time
series—which presents a picture of Canadian investment that cannot be
matched either in comprehensiveness or in detail by published estimates
covering the United States. These estimates are of course of varying quality,
and every user must assure himself that their defects are not critical for his
purposes. But they have been put together on a systematic plan and appear
from the descriptions to be at least as reliable as the best comparable esti-
mates compiled for the United States.

¹ Assuming a 2 per cent annual increase in productivity per man hour and no redistribution of in-
come or increase in unemployment. As . . . are about 5 per cent of wages and salaries, . . .

More than half the book is taken up by Part I, which constitutes a 188- 
page interpretative summary of the stark facts contained in the statistical 
sections. This summary is comprehensive rather than intensive, but it adds 
up to a highly instructive account of the growth and changing structure of 
the Canadian economy as disclosed in a quarter-century of investment ex- 
perience. Every reader of this summary will be impressed by the author’s 
wide knowledge of Canadian economic development and by his sure sense 
of the special factors which serve to diversify investment patterns in different 
regions and industries. And if the reader is familiar with United States 
experience, his attention will be arrested by a number of striking parallels. 
For the diversity of Canadian resources is scarcely less than that of our
own, and the Canadian people are embarked on a course of industrialization
which recapitulates much of our economic history at a substantially stepped-
up pace. Thus, in the last quarter century, manufacturing has gone ahead
of agriculture in the number of persons employed and now contributes
roughly three times as much to national income. Utilities, too, have under-
gone a phenomenal expansion until, in relation to population, Canada’s
railroad, telephone, and electric power networks rank among the largest and
most advanced in the world. And, finally, Canada has pioneered in working
out viable relationships between free private enterprise and a federal 
system of government. In all of these respects Canadian experience supple- 
ments our own, and Dr. Firestone’s monograph should find a grateful audi- 
ence among students of investment in this country. 


Psychological Analysis of Economic Behavior. George Katona. New York:
McGraw-Hill Book Company, 1951. Pp. ix, 347. $5.00.


Margaret G. Reid, University of Chicago


The author has set for himself the task of describing “a psychological
approach to economic analysis and the current research in the field of
economic behavior,” to be done in such a manner as to reach “the general
public as well as the expert.” (p. v.) To the reviewer the book has more of
the earmarks of a text for undergraduate students than a monograph for
experts, or of a book for lay readers, however interested they may be “in
present-day American economic life.” Furthermore it appears to deal with
specialized research rather than what can be described as “the current
research.”

The book is divided into five parts: Part I. Problems and Tools; Part II. 
Consumer Behavior; Part III. Business Behavior; Part IV. Economic
Fluctuations; and Part V. Research Methods. Part II contains more than
one third of the pages, and almost all the statistical data presented about
economic behavior. Parts III and IV deal with concepts, theories and
generalizations based on non-statistical observations, despite the author’s
judgment that “statements not susceptible of empirical validation have no
place in psychology.” (p. 29.) Unsubstantiated and in fact highly dubious
statements occur; for example, “Business and labor leaders have become
increasingly aware of their responsibility to the public as well as of the fact
that their own interests can best be promoted by striving for what is in the
general interest.” (p. 294.) One chapter in Part I is devoted to “What kind
of psychology?” In this the student is presumably to be introduced to the
“solidly established tenets of modern psychology.” (p. 8.) However, the
psychological ideas actually used in later discussion are so much a matter of
common knowledge that the person who omits this chapter is not likely to 
feel handicapped. 

Part V dealing with Research Methods comprises 34 pages. Attention 
centers on the investigation of attitudes rather than of behavior as such and 
types of schedules and questions and their use in getting reports on attitudes. 
The text ends with two pages devoted to “research projects” in which the 
author emphasizes the need for examining in so far as possible the effect of 
one factor at a time. He expresses the belief that considerable progress has
been made in this direction. He is, however, very vague or silent as to the
extent to which a systematic field of knowledge has developed in the psy-
chology of understanding economic behavior or the concrete gaps that exist
and should be a challenge to research workers. 

With minor exceptions data presented in Part II are those provided by 
the Consumer Finance Surveys conducted by the Survey Research Center 
of the University of Michigan for the Federal Reserve Board. Out of 34 for- 
mal tables 32 present data from this source, one describes the components 
of the national budget in selected years and one presents hypothetical data. 
These tables furnish a useful digest of the findings of the Consumer 
Finance Surveys and include some data not previously published. 

The author states at the outset that “it will be shown in the book that 
studying the motives, attitudes, and expectations of consumers and business- 
men contributes to the understanding of spending, saving, and investing. 
... The results of psychological-economic studies will supplement the 
traditional analysis of supply, demand, income and consumption.” (p. 4) 
The expectations raised by considerable emphasis at the outset on “motives, 
attitudes and expectations” are likely to be disappointed for many readers. 
Out of 32 tables presenting findings from the Consumer Finance Surveys 
only 11 include data on attitudes, motives and expectations. The author 
does not, however, devote much space to considering the unique contribu- 
tion of the two types of data. He does emphasize that people are able to 
predict change in future income but little or no recognition is given to the 
fact that the incomes of individuals and of families tend to move through 
well-defined cycles so that individuals tend to be able to predict the course 
of their income even when unable to predict the course of national income. 
The author does state “Perhaps primarily, the analyst can rely on the 
results of cross tabulations between forms of behavior and certain charac- 
teristics of people.” (p. 75.) The context would seem to imply that the 
author had in mind things other than “motives, attitudes and expectations,” 
although contact with certain sources of information is included in charac- 
teristics. The data used have a uniqueness with respect to characteristics 
of people that for many will be the major attraction of this book. Whereas 
the earlier analyses of consumption and current savings in relation to eco- 
nomic levels were largely confined to the effect of income of one year the 
analysis presented here in addition includes income of two years, extent of 
change in income and assets and especially in liquid assets. The author is, 
however, handicapped in his analysis of numerous variables by the necessity 
of using a small national sample. 

Because of the excessive attention to variations in behavior factors 
among individuals readers may feel that they lose sight of a major fact of 
interest—namely the mass response to certain types of change. Some of the 
problems of aggregation are touched on but only in a rather tentative fashion. 

The book is marred by occasional carelessly worded or superfluous state- 
ments. Some confusion, for example, exists in the meaning of “fact” and 
“knowledge” and “response” of interviewees. It would hardly seem nec- 
essary to make an extensive investigation to disclose that “people’s mo- 
tives, or needs, desires and hopes ... were conflicting.” (p. 73). Nor does 
it seem necessary to hypothecate after the accumulated observation of 
human history, not to mention the development of philosophy, that “some 
of our expenditure may be found to be nothing but expressions of whims or 
the results of emotions.” (p. 69.) In view of the classification of money spent 
on durable goods as expenditures the reader is likely to be a little puzzled 
at the statement that “the net sum of those relatively infrequent outlays 
which are not used for the satisfaction of immediate needs is considered 
saving.” (p. 151.) We are also told that “There were people who felt that they 
were seriously affected by the ‘depression’—mostly people with low incomes.” 
(p. 281.) One is left to wonder whether they were people with low incomes 
because of the depression. After a general discussion of “probable reactions 
to policy” without any empirical evidence the author concludes that “it is 
often possible to differentiate between appropriate and inappropriate re- 
sponses to economic policy and to economic change.” (p. 288.) 

WESLEY CLAIR MITCHELL: THE ECONOMIC SCIENTIST* 


ADOLF A. BERLE, JR. 
Columbia University 


THE legacy of a scholar is his teaching. The tribute to a scholar is 
the esteem of his peers. The immortality of a scholar is the influence 
his thinking exerts on generations yet to come. Wesley Clair Mitchell 
can fairly be named the greatest American economic scholar of the 
twentieth century. Laying a foundation for the evolution of scientific 
economics was his precise and monumental contribution. Why it was 
done at all and how it was achieved is a subject of first importance. Be- 
cause of Mitchell's work, economics can fairly claim now to be called 
a “science,” just as, for lack of a similar contribution in political science 
and sociology, these sisters of economics still grope in the gray fog that 
lies between the area of speculation and the explored territory of gen- 
eralization and conclusion from observed facts. 

Colleagues, disciples and students bear affectionate and discerning 
witness in this volume to his achievement. It is fitting that they have 
done so because of their relationship with the man; but, more impor- 
tant, it is useful that they have done so because the book is a case- 
study of the process by which economics was translated from a kind of 
theology into a more or less exact science. The essays here collected by 
Arthur F. Burns thus are far more than a memorial; they are a con- 
tribution to the study of economic methodology, and cannot fail to be 
causative. 

Before Mitchell tackled the field of economic method, economics 
consisted of speculative development of hypotheses formulated by 
earlier, intelligent observers of their environment. These attempted 
generalization from their limited and personal observation. Such was 
the work of Adam Smith, of Ricardo, of John Stuart Mill, and later of 
Marshall and of Taussig. Their speculations opened a wide door (of 


* A review of Wesley Clair Mitchell: The Economic Scientist, edited by Arthur F. Burns (Na- 
tional Bureau of Economic Research, Inc., 1952). 

this, more later). Without them, no science could ever have been con- 
structed; but acceptance of their hypotheses without demonstrating 
their factual base clearly left economics in the realm of inductive phi- 
losophy rather than of scientific demonstration. It remained for 
Mitchell to pioneer the job of isolating identifiable economic factors, 
determining analytic definitions of phenomena, working out methods of 
accurate measurement and continuing observation, verifying relation- 
ships or at least establishing probability of a relationship of cause and 
effect between phenomena. Classification, description, measurement, 
establishment of interrelationships, generalization, looking toward pre- 
diction, this in economics as elsewhere is the basis of scientific method. 
As a young man, Mitchell was unsatisfied by speculation without proof, 
and he accordingly introduced scientific method into economics with 
modesty and integrity, combined with a good humor which made him 
at once the most beloved as well as the most respected of contemporary 
social scientists. 


How he did it is partly a saga of life, partly a demonstration of close 
thinking. In one sense, he never transcended an early conception 
blocked out when he was studying at the University of Chicago. There 
he was working on that grimmest of assignments, a Ph.D. thesis, de- 
signed to be “A History of the Legal Tender Acts.” But it evolved into 
“History of the Greenbacks,” and became eventually a powerful analy- 
sis of monetary theory in the United States. His intellectual godfathers 
in that period were surely diverse: Thorstein Veblen, brilliant, tor- 
mented, biting; John Dewey, challenging conventional process of 
thought; Jacques Loeb, psychologist and physiologist, devoted with- 
out limit to scientific method. This constellation was something for 
students to dream of; a brilliant analytic mind, a sceptical but human 
philosophical mind, and a rigorous scientific methodological mind. 
Mitchell absorbed wisdom from all three; and an economist was formed. 
In unpredictable result, the concept of a true science of economics was 
really generated through Mitchell by a tempestuous dreamer, a revolu- 
tionary philosopher and a pessimistic physiologist. 

The consequence of this amazing combination of stimuli (carefully 
analyzed in an essay by Professor Frederick C. Mills) was Mitchell's 
application of methods known in physical science in the form of a mas- 
sive monograph on “Business Cycles.” It was magnificently and ac- 
curately factual—a description rather than an explanation. At the 
close of his life, Mitchell was still working on business cycles. He still 
considered that he had not solved the problem. But as a result, the 
“business cycle” had been identified if not explained. The National 
Bureau of Economic Research, in whose organization Mitchell had 
played the leading part, had made notable progress in the vast task of 
analyzing, describing, and collecting accurate data in sequence. Meas- 
urement and actual calculations of “national income” and of “gross na- 
tional product,” the analysis of the rise and fall of price levels, all stem 
from this life-time work. “The notion that inquiries should be framed 
from the start in such a way as to permit of testing the hypothetical 
conclusions,” the notion of sequence, the concept of consecutive 
growth as opposed to older ideas of assumed but unexplained (and ac- 
tually non-existent) equilibrium, are perhaps the most causative of 
Mitchell's ideas. Hundreds of men are working now on lines of investi- 
gation opened up by them. 

The bibliography of his work, beginning with “The Quantity Theory 
of the Value of Money” (1896) and closing with “What Happens dur- 
ing Business Cycles: A Progress Report” (1951), is a catalogue of hun- 
dreds of studies, large and small, of exact, factual data, so handled that 
each piece of measurement or description could be related to and used 
in connection with other collections of fact-data. Inspired by his in- 
fluence, a host of men have since followed his example. At the close of 
his life, great areas of economic theory have been or can be tested by 
reference to relevant facts. A demonstrated conclusion can be dis- 
tinguished from unverified theory. Economics has moved steadily for- 
ward; it is fairly on its way. 

The essays here collected show all this, though each naturally follows 
the turn of its respective writer. Mills, perhaps chiefest heir of Mitch- 
ell’s thought, places his professional sketch next to a tender and pene- 
trating memoir written by Lucy Sprague Mitchell. The two are not as 
far apart as might be supposed. Research which tests dogma against 
fact is always controversial and in some measure impersonally destruc- 
tive and it may even be revolutionary. An honest man may find him- 
self intellectually bound to refute by his facts the most tenaciously held 
doctrine, and is not popular when he does so. A true scientific scholar 
must set his lance above mischance and ride the barrière. The profes- 
sional disciple and the wife find much common ground in their descrip- 
tions. 

Mitchell was always working on two levels: thus, in 1920 he was 
founding the National Bureau of Economic Research with the imper- 
sonal scientific end of measuring magnitude of national income and 
measuring its principal components. But in the same year he joined 
James Harvey Robinson and Alvin Johnson in founding “The New 
School for Social Research” as a part of his life-long protest against 
restricting freedom of speech in academic institutions and somewhat 
later he was endeavoring to salvage the remaining years of Thorstein 
Veblen from the dark limbo into which Veblen’s stormy soul had 
steered him. Mitchell gave the best of his thought to governmental re- 
search and planning, which necessarily exacts a high degree of organ- 
ized conformity. But to the end of his life he worked at experimental 
education—which is anything but conformist. To say of him, as does 
John Maurice Clark in his reprinted Memorial address, “American 
economics has lost its undisputed first citizen,” is a plain statement of 
fact. 

Method, well discussed by A. B. Wolfe, is worth a word. We shall 
always have with us the academic individual who fears to make a hy- 
pothesis or propound a theory and confines himself to the less contro- 
versial task of measuring phenomena. But the fact is that measuring 
phenomena without some hypothesis to determine what facts to collect 
is sterile business. Even a fact-collector must have a hypothesis—else 
a count of paving stones in Times Square would be just as significant 
as calculating the size and distribution of the national debt. Mitchell 
never fell into the error of first accumulating facts and then construct- 
ing a hypothesis. He knew better than most that the hypothesis guides 
the collection of facts; but that from the collected facts, the hypothesis 
can be tested, revised, or discarded and new hypotheses can be made. 
On the other hand, he never committed to any body of theory, and 
thereby provoked a major debate in certain quarters. 

In the one critical essay included in the volume, Paul T. Homan ob- 
serves that, as a sequel of Mitchell’s method, increasingly sound scien- 
tific work is being devoted to detailed studies, disclosing the manner 
of the workings of portions of the economic system “to the ultimate end 
of assisting in intelligent social guidance. Realism is Allah, and Mitchell 
is his prophet. Paradise may be around the corner.” Unless, suggests 
Homan, you commit to some sort of theory, you are lost. Implying that 
Mitchell did not commit himself, he ironically adds that there is no 
scientific reason to suppose that the “voluntary process of cumulative 
causation is amenable to intelligent social control. . . . There is no guide 
here but faith.” Which is another way of saying that value judgment of 
Mitchell’s scientific economics depends on philosophical premises; that 
Mitchell had one but did not state it and could not test it; that for all 
his realist detachment, Mitchell’s work was really energized by quite 
homely, non-scientific convictions, and that he might as well have 
stated them as theory. 


The implicit criticism is, of course, that without a comprehensive 
theory which Mitchell never embraced and never formulated, the sig- 
nificance of his work can not be appraised. This criticism is fair enough 
though not too damaging. Actually, as a result of Mitchell, any eco- 
nomic theory, formulated or hereafter formulated, will be held valid 
only if verified by observable data, and testing for some time to come 
will probably be by use of the statistical tools Mitchell forged or by 
methods he developed. Development of method was Mitchell's chosen 
function; fundamentally he was a master toolmaker. One suspects he 
also cherished (by inheritance from Veblen) a hidden and unfulfilled 
desire to write a great Utopia, but consciously limited himself to sta- 
tistical work lest he become intoxicated with his own dreams. But— 
men are led by their dreams and their dreamers, as politicians know all 
too well. Statistical analysts can merely record the results of their pur- 
suit, in hope that succeeding generations may perhaps be better guided 
in selecting what manner of dreams and dreamers to follow. 

If Mitchell’s name is attached to anything, it will be to a school of 
economic thought now known as “institutionalism.” In Wolfe’s words, 
“Economics must be a concrete and realistic study of institutional 
habits and relations, and these must be regarded not as fixed but as 
continually changing; hence economics is not only an institutional but 
a dynamic and evolutionary science.” To this is linked another apho- 
rism that economics has only one justification—the furtherance of eco- 
nomic welfare. Men, en masse, behave in certain standard ways, accord- 
ing to uniform though complex patterns. These patterns are in constant 
evolution, but in part at least they can be measured by statistics. Pat- 
terns of human action tend to be reflected in institutions or persistent 
groupings; they include, for instance, the American corporation (the 
connection in which this reviewer first met Wesley Mitchell) or the 
habits and framework of consumer credit or possibly a Communist 
commissariat or twenty other similar institutional operations. These 
institutional patterns become, in themselves, factors of greater or less 
force in economic action. All this transcends what was commonly 
known as “orthodox” economic theory. (Milton Friedman makes this 
point with discernment in his chapter on “The Economic Theorist.”) 
It accounts for Mitchell’s frank recognition that the money economy 
(commonly miscalled “capitalist” economy) is by no means the only 
possible system in which economic principles have validity, or in which 
economic science can exist. 

* * * 

Because the chief authors of these essays are economists, any lay 
reviewer must submit his comments with diffidence. No criticism of 
Mitchell is implied in the observation that his insistence on demon- 
strated facts isolated him from some of the driving conflicts of his time. 
Facts are few and hard come by; to insist on them before asserting a 
viewpoint does accomplish detachment. Yet detachment is, and in 
Mitchell’s case was, bought at a price, whose extent is just beginning 
to be visible. If, for example, the dominant motive activating human 
patterns ceases to be individual search for profit, and becomes rather a 
reaching for power, individually or through collective groups, much of 
current economic theory would have to be revised and statistical con- 
clusions would assume different significance. Only in 1952 an econo- 
mist’s first attempt at reappraising the American capitalist system in 
this sense has reached print in Kenneth Galbraith's American Capital- 
ism: The Concept of Power. The thesis would not be undreamed of in 
Mitchell’s philosophy—but it would be excluded from his economic 
work. 

Again, certain bench marks sacred to economists, such as the “basic 
interest rate,” may be a result of inevitable tides; or they may be a 
result of more or less consciously managed monetary systems. Indi- 
vidual search for security and comfort, indeed, may perhaps give way 
under some circumstances to a mass desire, or at least willingness, to be 
dedicated or subjected to directed collectivism, as for example, in the 
Soviet Union. It is quite conceivable that as modern technology pro- 
gresses, the power motive rather than the profit motive may become 
the key to institutional development. It is at least possible that the 
titanic world struggle of today arises from the tensions created by some 
such shift. It could be, in a word, that economics apart from political 
science is meaningless abstraction. Detachment from political struggle 
may not be a luxury permitted to economists in days to come. Mitchell 
never ignored such struggles; but he never mixed his thoughts about 
them with his economic work. 

In like manner, Mitchell's insistence that measures looking toward 
social reorganization must be based on “established knowledge” is 
merely Utopian. Statesmanship usually has to meet crises arising out 
of the unknown or unexplained. Historians may later supply the “es- 
tablished knowledge” needed for full diagnosis; but the statesman, 
businessman, or politician rarely has the luxury of awaiting that happy 
day. He must act in accordance with someone's best estimate of prob- 
abilities, and the event will prove him right or wrong. For an econo- 
mist to refuse to estimate in these circumstances is scientifically credit- 
able, but it deprives the politician of the putative best estimate. 

For laymen, much of the economist’s language is almost unintelligi- 
ble. But they understood Mitchell. Most educated people today have 
some idea of the national income of the United States and of its fluctua- 
tions, and the national government uses his statistical instrument as a 
chief factor in determining fiscal policy. Laymen can and do make use 
of the growing body of data covering its use and its distribution. Busi- 
nessmen as well as economists have a gradually growing knowledge of 
the relation of money and credit to the rise and fall of employment and 
trade, and to the creation and distribution of goods and services. 
Thousands of investors who never heard of Mitchell know about the 
business cycle, and seek to anticipate it in conducting their affairs. 
Hundreds of bankers have unostentatiously changed their financial 
theory because of his critique of the quantity theory of money. Impact 
of his ideas is found all over the American governmental and business 
system; it is no light recognition of his accomplishment. 

Yet even today Mitchell’s greatest premise has not been fully under- 
stood. Nearly thirty years ago he indicated it: 

In becoming consciously a science of human behavior, economics will lay 
less stress upon wealth and more stress upon welfare. Welfare will mean 
not merely abundant supply of serviceable goods but also a satisfactory 
working life filled with interesting activities. . . . 


He dreamed of developing “criteria of welfare” as well as of statistical 
measures for income and wealth. In his view, this science of economics, 
struggling to be born, was but one tool in the greater search for life— 
however conceived. 

Mitchell’s great contemporary, Professor Joseph Schumpeter of 
Harvard, a few days before he also joined Mitchell in the Valhalla of 
Scholars, wrote in closing the record: 

Here was a man who had the courage to say, unlike the rest of us, that 
he had not all the answers; who went about his task without either haste 
or rest; who did not care to march along with flags and brass bands; who 
was full of sympathy with mankind's fate, yet kept aloof from the market 
place; who taught us, by example and not by phrase, what a scholar should 
be. 


THE VELOCITY OF TIME DEPOSITS* 


GEORGE GARVY 
Federal Reserve Bank of New York 


THE rate of turnover of time deposits belongs to the outer reaches of 
the terra incognita of the behavior of liquid assets. “Liquid assets” 
and not “money” is intentionally used because the question whether 
time deposits are part of the money supply is one on which there are 
differences of opinion among monetary theorists. Even within the 
Federal Reserve System there is no unanimity: while the Board of 
Governors includes in the money supply not only time deposits of com- 
mercial banks, but also deposits of mutual savings banks and of the 
Postal Savings System, the Federal Reserve Bank of New York does 
not include any of these items.¹ 

Several writers on monetary problems have inquired into the relative 
velocities of circulation of various components of the money supply, 
but only after making their own decisions on the proper definition of 
money.² Their explorations, moreover, generally involve certain as- 
sumptions as to the velocity (and other dimensions) of one or several 
components of the total money supply for which empirical data are not 
readily available. Such assumptions with respect to time deposits range 
from Keynes’ view that savings deposits have a “velocity of zero”³ to 
Burgess’ more generally used estimate of a rate of turnover of twice a 
year.⁴ Most recent writers either confine themselves to the statement 
that the rate of turnover of time deposits at commercial banks is low 
or very low, or fall back on Burgess’ estimate.⁵ 
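
As a rough aid to reading these estimates, a rate of turnover can be restated as an average holding period. The sketch below is illustrative only: it assumes the conventional definition of turnover as annual debits (withdrawals) divided by average balances, which the text does not spell out, and the function names and the sample dollar figures are hypothetical.

```python
import math


def turnover_rate(annual_debits: float, average_balance: float) -> float:
    """Annual rate of turnover, assumed here to be debits / average balance."""
    if average_balance <= 0:
        raise ValueError("average_balance must be positive")
    return annual_debits / average_balance


def average_holding_period_years(rate: float) -> float:
    """Average time a dollar stays on deposit, implied by an annual turnover rate."""
    if rate < 0:
        raise ValueError("rate must not be negative")
    if rate == 0:
        # A "velocity of zero" means the deposit is never withdrawn.
        return math.inf
    return 1.0 / rate


# Hypothetical bank: $200 of withdrawals against $100 of average deposits.
rate = turnover_rate(200.0, 100.0)

# A turnover of twice a year (Burgess' estimate) implies six months on deposit.
months_on_deposit = average_holding_period_years(rate) * 12
print(rate, months_on_deposit)
```

On this reading, the range of assumptions cited above runs from an unbounded holding period (a velocity of zero) to one of about half a year (a turnover of twice a year).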

It is not intended to review here the controversy on the “proper” 
definition of the money supply. Indeed, a meeting of minds is more 
likely to be advanced by an inquiry into the behavior of time deposits 


* This paper is based, in part, on a report prepared for a technical committee of the Federal Reserve 
System. 

¹ See, for instance, the article “Money Supply” in the Monthly Review of Credit and Business Condi- 
tions of the Federal Reserve Bank of New York, Nov. 1951. 

² See, for instance, A. Marget's exhaustive discussion in The Theory of Prices, Vol. I, Ch. XVI. 

³ See Marget, op. cit., Vol. I, p. 463, footnote 10. 

⁴ W. R. Burgess, “Velocity of bank deposits,” Journal of the American Statistical Association, 1923, 
pp. 727-40. Burgess’ estimate, however, was based on fragmentary data obtained in 1922 from only 
six New York City banks for short periods, usually one month in each case. The validity of his es- 
timate for the country as a whole may be challenged on the basis of data which have become 
available subsequently and which show that time deposits in other parts of the country turn over 
less rapidly than in New York City. 

⁵ See, for instance, L. Currie, The Supply and Control of Money in the United States (Harvard Uni- 
versity Press, 1934), p. 50. Angell does not inquire into the turnover of time deposits because “there 
appears to be no data for measuring their rate of …” The Behavior of Money 
(McGraw-Hill Book Co., 1936), p. 94. 

than by further exploration of the logical possibilities of defining attri- 
butes of money. If it is agreed that some types of liquid assets, such as 
currency and demand deposits, are money par excellence, the approach 
might be to inquire whether time deposits (and other classes of liquid 
assets) show similar velocity and similar cyclical and seasonal patterns.⁶ 
If business firms and consumers consider time deposits as equivalent to 
money, the behavior of time deposits should bear a close resemblance 
to that of currency and demand deposits. For the present study, the 
behavior of demand deposits will serve as a convenient frame of refer- 
ence since comprehensive data on the velocity and patterns of use of 
currency are lacking. 

Because of the paucity of data for commercial bank time deposits, 
the scope of this inquiry has been broadened to include savings institu- 
tions. While the great bulk of deposits of savings institutions consists of 
savings deposits of individuals, time deposits of commercial banks⁷ also 
include time deposits of their trust departments, of business firms, and 
of government units. Such deposits are frequently held in the form of 
certificates of deposit or in open accounts. The shift in the composition 
of time deposits accounts, as will be shown below, for the divergent 
movement of the rate of turnover of time deposits and of savings de- 
posits. 

Data on the turnover of total time deposits at commercial banks are 
available for only a short period prior to World War II. They have been 
collected from about 400 “money market” banks (weekly reporting 
member banks) in New York City and 100 other leading cities for the 
relatively short period from September 5, 1934 to February 1, 1939. 
Data on the velocity of savings deposits only, were collected, on an 
annual basis only, by the Savings Division of the American Bankers’ 
Association from about 140 scattered commercial banks for the period 
1940-47; similar rates for 1949-51 were obtained from more than 800 


⁶ It should be clear from the above that the concern here is with the “moneyness” of time deposits 
rather than with other aspects of the problem, such as the fact that time deposits provide reserves for 
credit expansion or that it is difficult to separate the two types of deposits. These two points have been 
used by Goldenweiser (in his recent American Monetary Policy) as arguments for including time deposits 
with the money supply. 

The fluidity of the concept of money has been increased by the war-time growth of liquid assets 
possessing a high degree of convertibility and attaining annual turnover rates similar—as a matter of 
fact, identical—to those of time deposits. Thus, Series E Savings bonds, after being held for at least 
sixty days, now are as easily and conveniently converted into cash as time deposits. Furthermore, while 
legally a 30-day notice can be enforced against holders of savings accounts, Savings bonds are payable 
on demand. Obviously, all types of liquid assets possess some degree of “moneyness.” Marget’s argu- 
ment that all deposits have some rate of turnover—be it only once every 1,000 years—may be extended 
to include all types of liquid assets. 

⁷ U. S. Government and interbank time deposits are always excluded from the term “time deposits” 
as used in this paper. 

commercial banks. These data can be supplemented by time series for 
the turnover of deposits at mutual savings banks, savings and loan 
associations, and the Postal Savings System which can be computed for 
periods of various lengths. In addition, several special surveys con- 
ducted by the Savings Banks Trust Company help to shed additional 
light on the behavior of savings deposits. 


THE RELATIVE IMPORTANCE OF SAVINGS DEPOSITS IN 
COMMERCIAL BANK TIME DEPOSITS 


The justification for drawing limited inference from the behavior of 
savings deposits to the behavior of total time deposits in commercial 
banks lies in the fact that the share of the other types of time deposits 
in total time deposits is known to have been small and declining. The 
earliest available data on the proportion of savings deposits in time 
deposits are limited to national banks, fewer than half of which re- 
ported savings accounts separately for June 30, 1910. In these banks, 
savings deposits accounted for about 57 per cent of a derived total, 
“time deposits”;⁸ by June 1921, the percentage of savings deposits and 
open accounts combined (no separate figures being available) was over 
73 per cent, and by June 30, 1926, over 79 per cent. Time certificates of 
deposit, which in 1917 accounted for nearly 40 per cent of all time de- 
posits, in 1926 represented only 21 and in 1929 16.5 per cent. 

In subsequent years, the importance of certificates of deposit and of 
open accounts declined not only relatively, but also absolutely, as 
shown in the following table for all member banks of the Federal Re- 
serve System: 


                           Savings      Time certificates     Open        Total time
                           deposits        of deposit        accounts      deposits

                                        (In millions of dollars) 

December 31, 1928            9,810           1,805            1,089         12,794 
June 30, 1942 (latest 
  date available)           10,357             566              749         11,073 


* Including small amounts in Christmas savings and similar accounts. 
Source: Board of Governors of the Federal Reserve System, Member Bank Call Reports. 


⁸ Prior to the Federal Reserve Act, national banking legislation did not specifically distinguish be- 
tween time and demand deposits. Since the classification of deposits reported to the Comptroller of the 
Currency … of the reporting banks (see Annual Report of the Comptroller of the Currency, …, p. 7), 
… because of incomplete coverage, the percentage for 1910 is … more than an …. 
After 1921, the classification “savings deposits” had become identical with the subtotal “time de- 
posits other than certificates” and was reported by all national banks having such deposits. 

All time deposits, used here, exclude State and municipal and Postal Savings deposits. 

By June 30, 1945 (the latest date for which such data are available 
for all national banks), the percentage that savings deposits represented 
of all time deposits of individuals, partnerships, and corporations at 
national banks had risen to 95.9 per cent. It was slightly lower—94.0 
per cent—for all commercial banks largely because of the inclusion of 
State trust companies.

Although no national data are available for the postwar years, those 
for State commercial banks and trust companies in New York State 
indicate that the prewar decline of time deposits other than savings 
deposits is still continuing. Even though the accelerated growth of the 
relative importance of savings deposits during the war years was re-
versed in the first postwar years, at the end of 1951 savings deposits
represented a higher percentage of total time deposits than at the end 
of 1940. This was true for banks located outside New York City (where 
the ratio in earlier years was close to the national average) as well as for 
those in New York City (where it was much lower than for banks out- 
side New York City because of the importance of trust department and 
other types of open accounts): 


Savings Deposits as a Percentage of Total Time Deposits of
Individuals, Partnerships, and Corporations 


End of New York City Rest of New York State 
1940 41.7 95.4 
1945 65.0 98.1 
1951 47.1 97.1 


Source: New York State Banking Department. 


Thus, the long-run decline of the relative importance of time deposits 
other than savings deposits is apparently still continuing. Outside New 
York City (and possibly outside Chicago and one or two other money 
centers) time deposits other than savings accounts are, however, al-
ready reduced to such a small proportion of the total that their further
decline is unlikely to have much effect on the velocity of total time
deposits.


TURNOVER RATES OF TIME AND SAVINGS DEPOSITS 
During the years immediately preceding World War II, time deposits
in weekly reporting member banks outside New York City turned over
from 0.66 to 0.82 times a year (annual averages; for monthly data, see
Chart I). However, time deposits at commercial banks include some
deposits of business firms, local governments, and trust departments,
which tend to be more active than savings deposits. Since the weekly
reporting banks include essentially money market banks with a rela- 
tively high proportion of such deposits, it is likely that the velocity of 


Chart I. Annual Rates of Turnover of Time Deposits* in Weekly Reporting
Member Banks Outside New York City, and of Regular Deposits† in New York
State Mutual Savings Banks. September 1934 through January 1939.

[Chart: annual rate of turnover, 1934-1939; curves for member banks outside N.Y. City and for N.Y. State savings banks.]

* Total time deposits except interbank between September 1934 and January 1938; thereafter time
deposits of individuals, corporations, etc., States and political subdivisions. See footnote 6, page 17.
† Including school savings accounts.

Source: Board of Governors of Federal Reserve System, and Savings Banks Trust Co.


time deposits at all commercial banks outside New York City was
lower than the rates computed from figures reported by these money 
market banks. 

For the same reason, the velocity of time deposits in New York City
(where, in recent years, savings deposits may have accounted for as
little as 50 per cent of time deposits) ranged between 1.65 and 1.99.⁹ In
comparison, demand deposits at the weekly reporting member banks
in New York City turned over about forty times as rapidly as time de-
posits.¹⁰

A detailed analysis of these data was made at the Board of Governors
of the Federal Reserve System only for the single year 1936, when the 
annual rates of turnover ranged between 0.5 (Philadelphia District and 
New York District outside New York City) and 1.1 (San Francisco 
District). In New York City, time deposits were more active, but still 
turned over less than twice (1.4 times) a year as compared with 0.8 time 
for all 100 outside centers. The higher turnover rates in New York City 
were traced to relatively active time deposits of trust departments of 
some banks and in the San Francisco District to a large proportion of 
rather active local government funds. It was also suggested that report- 
ing (contrary to instructions) of renewals of maturing certificates of 
deposit as debits had likely resulted in overstating the velocity of time 
deposits, especially in the Boston and Minneapolis Districts (where 
certificates of deposit are more important than in other districts). The
reporting of transfers of time deposits to demand deposits is likely to
have had the same effect.¹¹

The results of the ABA surveys are consistent with the rates based 
on Federal Reserve System studies. They show that annual rates of 
turnover of savings deposits at cooperating commercial banks ranged
in 1940-51 between 0.45 and 0.60; the rapid growth of savings de- 
posits which have more than doubled during this period may be re- 
garded as a sufficient explanation for the slight downward drift in the 
average rate of turnover over the twelve-year period. 


9 The rates of turnover of time deposits in weekly reporting New York City banks reflect primarily
activity of time accounts other than personal savings accounts. Their monthly fluctuations during the
years 1935-39 were large and did not display the seasonal pattern shown by outside reporting centers or
by savings banks. Increases in the rate of time deposit turnover of 50 or more per cent from one month
to another were not infrequent. Such fluctuations most likely reflect unusual withdrawals from cor-
porate, municipal government, and trust department time deposits.

While the rate of turnover of time deposits in banks outside New York City drifted downward dur-
ing the period under review, the corresponding rate for New York banks rose.

10 On the turnover of demand and of total deposits, see George Garvy, The Development of Bank
Debits and Clearings and Their Use in Economic Analysis (Washington, 1952), Ch. VII.

11 Turnover rates for individual banks were studied in the Boston District only. Wide variations
were found, partly reflecting differences in reporting and classification practices of banks.

12 A careful inspection of rates of turnover for individual banks in the ABA sample for 1940-47
reveals that in most banks, these rates were within the range of 0.3 and 0.6, which may be considered
narrow in view of the differences in the structure and mobility of population, banking facilities, account-
ing practices, age of the savings departments, and other pertinent factors.

While institutions with very large savings deposits generally have higher turnover rates than those
which hold a smaller volume, there are exceptions. As a matter of fact, some banks with upward of
$100,000,000 of savings deposits ranked among those with the lowest turnover rates.


Annual Rate of Turnover of Savings Deposits, Savings Departments in Commercial Banks*

Year            Year
1940   0.53     1946   0.60
1941   0.58     1947   0.56
1942   0.54     1948   n.a.
1943   0.49     1949   0.45
1944   0.47     1950   0.48
1945   0.49     1951   0.46


* The survey for 1940-1947, conducted in 1948, covered 140 banks; the survey for 1949-51, made in
1952, included 808 banks.

Source: American Bankers’ Association.

In contrast, deposits of mutual savings banks have been turning over
since the end of World War II, on the average, once every four years; 
rates of turnover of shares of savings and loan associations since 1945 
have been of the same magnitude (see Chart II). These velocity ratios 
are about half as large as those of savings deposits at commercial banks 
according to the ABA surveys. The principal reason for these differ- 
ences in turnover rates lies in the broader coverage of commercial bank 
time deposits, and in institutional differences between the several types 
of savings institutions. They are, however, not of sufficient magnitude 
to preclude drawing some inference as to the behavior of turnover rates 
of time deposits prior to 1934 from data for mutual savings banks. 
Such data can be computed for New York State institutions for more 
than half a century.” 
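
The relation between a rate of turnover and the average interval between turnovers is a simple reciprocal. The following is a minimal sketch in Python, assuming that the rate is measured as withdrawals during a year divided by the average balance; the function names are illustrative and do not come from the sources cited, and the rates are round figures of the order quoted above.

```python
def turnover_rate(withdrawals: float, average_balance: float) -> float:
    """Annual rate of turnover, assumed here to be withdrawals / average balance."""
    if average_balance <= 0:
        raise ValueError("average balance must be positive")
    return withdrawals / average_balance


def years_per_turnover(rate: float) -> float:
    """Average number of years needed for deposits to turn over once."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / rate


# Round figures: mutual savings banks about 0.25, commercial bank
# savings departments about 0.50 turnovers a year.
mutual_interval = years_per_turnover(0.25)
commercial_interval = years_per_turnover(0.50)
print(mutual_interval, commercial_interval)
```

A rate of 0.25 thus corresponds to one turnover every four years, and a rate of 0.50 to one turnover every two years.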

Since 1900, the annual rate of turnover of New York State mutual 
savings bank deposits has fluctuated within the relatively narrow range
of 0.24 and 0.34, with the exception of 1929 (when it was 0.40) and most
war years (when it was somewhat lower). During the twenties, it
fluctuated at a higher average level than during the first two decades
of the century or the years since 1934. In recent years, the velocity of
savings deposits in New York savings banks has been still considerably
lower than in the twenties, although somewhat higher than during 
World War II years. It reflects clearly the speculative peak of 1929 
(when the rate of turnover rose to a level of one-third above that of the 
preceding years), the contractions in 1907 and 1937, and the uncertain-
ties caused by the declarations of war in 1917 and 1941. The increase
in 1950-51, traceable to the outbreak of hostilities in Korea, was rela-
tively minor.


Chart II. Annual Rates of Turnover of Regular Deposits* in New York
State Mutual Savings Banks and of Private Share Capital in all U. S. Savings
and Loan Associations. Monthly 1945-51.

[Chart: annual rate of turnover, 1945-1951; curves for New York savings banks and for savings and loan associations.]

* Regular deposits including school savings accounts.

Source: For savings and loan association data, Home Loan Bank Board. For mutual savings bank
data, Savings Banks Trust Co.

Turnover rates for New York State savings banks suggest that an in-
crease in the public’s desire to hold currency, resulting from an impair-
ment of public confidence or from other causes,¹⁴ is reflected in with-
drawals of savings deposits as well as of demand deposits. The velocity
of savings deposits, however, rises very little or not at all in periods of 

14 For a detailed analysis based on investigations made by several New York savings banks, see
Weldon Welfling, Savings Banking in New York State (Duke University Press, 1939), pp. 156 ff. Dr.
Welfling remarks, however: “The stock market activity would not necessarily produce an increase in
withdrawals and accounts closed, for an increase in loans on passbooks would serve the same purpose to
the depositor.”

business expansion. Thus, the rate of turnover of deposits of New York
savings banks did not increase in the late 1920’s until 1929, lagging 
behind the velocity of demand deposits at commercial banks outside 
New York City which rose nearly one-third during this period. Again,
while velocity of demand deposits increased by one-third from the low 
point in 1949 to the peak in the spring of 1951, the rate of turnover of 
savings bank deposits (and of shares of savings and loan associations) 
rose very little, perhaps 10 per cent. As already noted, it declined dur- 
ing the expansion of business activity during World War II, when, 
however, the activity of demand deposits at weekly reporting banks 
also slowed down. 


Rates of Turnover of Deposits in New York State Savings Banks

1900   .26     1920   .33     1940   .24
1901   .26     1921   .32     1941   .27
1902   .25     1922   .31     1942   .25
1903   .26     1923   .33     1943   .22
1904   .26     1924   .32     1944   .21
1905   .27     1925   .33     1945   .23
1906   .29     1926   .32     1946   .30
1907   .31     1927   .32     1947   .28
1908   .28     1928   .33     1948   .29
1909   .25     1929   .40     1949   .27
1910   .26     1930   .32     1950   .29
1911   .26     1931   .34     1951   .29
1912   .26     1932   .32
1913   .26     1933   .32
1914   .25     1934   .26
1915   .24     1935   .26
1916   .22     1936   .26
1917   .26     1937   .27
1918   .26     1938   .25
1919   .30     1939   .24

Source: Savings Banks Trust Company, Chart Book.

Slight increases occurred in 1945 and 1950 and a rather sharp rise in the
reconversion year 1946.

The relative stability of the rate of turnover of New York State 
savings deposits over a period of more than fifty years is entirely com-
patible with a declining trend in the rate of turnover of total time de-
posits,¹⁶ which reflects the gradually diminishing importance of time
deposits other than those evidenced by passbooks. There are two addi- 
tional reasons for presuming that, prior to the Banking Act of 1933, 
time deposits at commercial banks were more active than during the 
period covered by the Board’s study of weekly reporting member 
banks:

1. In the 1920’s, it was possible to make withdrawals from savings 
accounts in some banks without personal presentation of passbooks. 
Opinions differ regarding the extent to which holders of time deposits 
considered them practically equivalent to demand deposits and were 
actually able to draw checks against them. There is a voluminous 
literature on the subject, and views range from Schumpeter's assertion 
that “in the twenties time and demand deposits were essentially the
same kind of thing”¹⁸ to Currie’s reluctance to consider time deposits
essentially different from holdings of Government securities.¹⁹

It is uncertain whether the possibility of facile drawing against time
deposit accounts was actually of material effect on the rate of with-
drawals. On the other hand, after 1933, the prohibition of the payment
of interest on demand deposits and the introduction of service charges
(and their subsequent increases) may have induced some small deposi-
tors to make more active use of their savings accounts to cash checks
and to purchase cashier's checks for making payments rather than to
maintain separate but onerous checking accounts.

2. It is also frequently claimed that, before 1933, part of the time 
deposits in commercial banks actually represented misclassified de-
mand deposits.²⁰ There are no direct quantitative data, and none are
likely to become available, to resolve the conflicting views on the mean-
ing of the growth of time deposits at commercial banks during the


16 From the end of 1934 to 1939, when the turnover rates of the more homogeneous New York
savings deposits fluctuated around a horizontal level, the velocity of time deposits at weekly reporting
member banks outside New York City clearly showed a downward drift (Chart I).

17 See Federal Reserve System, Report of the Committee on Member Bank Reserves (Washington,
1931), p. 7.

18 J. A. Schumpeter, Business Cycles (New York, 1939), Vol. II, p. 856.

19 Currie, op. cit., p. 14.

20 See, for instance, B. M. Anderson's comments in The Chase Economic Bulletin (Nov. 8, 1926),
p. 14 n. As already mentioned, this view was also held very strongly by Schumpeter. Woodlief Thomas
wrote: “In the 1920's, they [time deposits] could be easily withdrawn from many banks, and sometimes
included funds that might otherwise have been in demand deposits.” (Banking Studies, p. 302).

twenties. The statistical evidence (supplementing personal knowledge
of the existing practices) used by those who championed the opposing 
views, seems to have been limited essentially to data on the rate of 
growth of time deposits in different classes of banks and on the average 
size of time deposit accounts. The growth of time deposits, rather than 
any direct evidence of the classification of active deposits in the time 
category, gave rise to the views held by Anderson and others that dur- 
ing the twenties part of time deposits actually represented demand 
obligations.²¹

It is impossible to arrive at a firm conclusion as to the extent to which 
bankers might have misclassified demand deposits or given special 
drawing privileges to owners of time deposit accounts. In any case, in
recent years more rigid standards applied by bank examiners must have
made the use of time deposits for current transaction purposes ex-
tremely difficult (even though increased reserve requirements for de-
mand deposits have made misclassification of deposits more attractive
to bankers). It seems reasonable to assume that the velocity of time
deposits has been declining over time primarily because of the decreas-
ing influence of the more active time certificates of deposit and open
accounts. Since these types of accounts are now reduced to insignifi-
cance except in large money centers, it is likely that the velocity of
time deposits will exhibit a stability similar to that of savings bank
deposits.


INTERPRETATION OF THE VELOCITY OF TIME DEPOSITS 


The concept of velocity of deposits has a precise analytical meaning
only if withdrawals can be equated with use for transactions purposes.
It is, however, not any more feasible to distinguish statistically in the
case of time than in the case of demand deposits between payments for 
goods, services, or the acquisition of assets and the mere transfer of 
balances between different depositaries or accounts of the same holder. 


21 Professor French, who has carefully examined the statistical data for national banks for 1922-28,
concluded that “the bank statistics show clearly that there could have been no appreciable amount of
shifting of deposits from the demand to the time category on the part of large depositors in national

between savings institutions,²² shifts among alternative forms of in-
vestment or accounting entries (such as expiration and renewal of
certificates of deposit) at least as much as it represents payments for
goods and services.²³

To clarify the significance of the turnover rate of savings deposits, 
two aspects of account activity are of particular interest: (1) the sea-
sonal pattern of turnover rates (which is nearly identical with that
of withdrawals, since savings balances are very stable) and (2) the 
close month-to-month correspondence between the amount of new de- 
posits and withdrawals. 

Most New York savings banks pay interest at the end of June and 
December, while others credit interest quarterly. The seasonal pattern
of turnover of deposits in New York savings banks (based on 1928-45) 
shows four peaks following these quarterly interest dates. The highest 
peak occurs in January when rates of turnover are about 50 per cent 
higher than in the two immediately preceding and two following 
months; the July peak is somewhat lower, while the April and October 
activity is only about 30 per cent higher than in the eight months of 
slow activity.” 
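
A seasonal pattern of this kind can be expressed as an index in which 100 stands for the average month. The sketch below uses a simple ratio-to-average calculation, which is an assumption made for illustration and not necessarily the method behind the figures quoted above; the monthly rates are hypothetical values chosen only to mimic the proportions described in the text.

```python
from collections import defaultdict


def seasonal_index(observations):
    """Return {month: index}, where 100 equals the average of the monthly means.

    observations: iterable of (month, rate) pairs, month numbered 1..12.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, rate in observations:
        sums[month] += rate
        counts[month] += 1
    month_means = {m: sums[m] / counts[m] for m in sums}
    overall = sum(month_means.values()) / len(month_means)
    return {m: 100.0 * mean / overall for m, mean in sorted(month_means.items())}


# Hypothetical annual rates of turnover by month: eight slow months at 0.20,
# January 50 per cent higher, July somewhat lower, April and October 30 per cent higher.
hypothetical_rates = {
    1: 0.30, 2: 0.20, 3: 0.20, 4: 0.26, 5: 0.20, 6: 0.20,
    7: 0.28, 8: 0.20, 9: 0.20, 10: 0.26, 11: 0.20, 12: 0.20,
}
index = seasonal_index(hypothetical_rates.items())
for month, value in index.items():
    print(month, round(value, 1))
```

With several years of monthly data the same function averages each calendar month before forming the index.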

22 This situation is indeed paralleled when demand deposits are shifted among several accounts
(at the same or at different institutions) of the same individual or firm. Since, however, the transaction
velocity of demand deposits is so much higher, the proportion of transfers to total transactions is likely
to be considerably lower than in the case of time deposits.

23 See Weldon Welfling, “Some Characteristics of Savings Deposits,” American Economic Review,
December 1940, pp. 748-58.

24 Quarterly data on withdrawals collected by the National Bureau of Economic Research (for one
New York savings bank for 1880-1923 and for five such banks for 1900-1947) consistently show peaks
in the first and third quarters, corresponding to the January and July peaks of the more inclusive but
shorter series used here.

25 The available data for time deposits of weekly reporting member banks outside New York City
show a broadly similar seasonal pattern, but the April and October peaks are much less pronounced than
those in January and July.

The seasonal pattern of withdrawals from savings and loan associations (based on 1944-51 data
for the entire United States) shows two sharp peaks, in January and July (which are even sharper than
for New York mutual savings banks), reflecting payment of dividends which in some cases is made auto-
matically by check. The rates of turnover of the private share capital of U. S. savings and loan associa-
tions decline gradually during each of the five months following the two months of peak withdrawals (if
the small dip in February is removed by placing the index on the daily average base).

In contrast, December is the peak month for rates of turnover of demand deposits at the weekly
reporting member banks, although the predominance of December has been declining in the moving
seasonal factors computed by the Federal Reserve Bank of New York. November is the second highest
month of demand deposit velocity. January is an average (or slightly above average) month, while July
has been consistently below average, reflecting the slowing down of business activity during the vaca-
tion period. High rates of turnover of demand deposits in the two last months of the year reflect Christ-
mas business, the movement of crops, and a multiplicity of business and personal end-of-year payments.

26 Tax payment dates and seasonal need for extra cash (for payment of Christmas charge accounts
in January and for vacation money in July) might be additional influences.

This pattern, which reflects that of withdrawals from savings ac-
counts, may be explained largely in terms of interest payment dates.²⁶
Since the general practice is to credit interest only for amounts remain-
ing on deposit during the entire interest period, depositors who do not
need funds urgently usually prefer to postpone withdrawals until the 
end of the interest period in order to earn the full interest. In particular,
investors who transfer their accounts from one institution to another
would normally do so after interest for the last period has been credited.
Individuals, such as retired people, who use the income from savings 
accounts (and the principal as well) to meet current expenditures, are 
likely to make withdrawals after the (semiannual or quarterly) interest 
payment dates. Finally, since there is in New York State an upper 
limit (prior to 1951, 7,500 dollars) on individual savings bank accounts, 
interest on deposits which are at the maximum is normally withdrawn 
because no interest is paid on the excess amounts. Although the propor-
tion of the accounts which are at the legal limit is small, they account
for a relatively large share of total deposits. 

While interest payment dates provide a sufficient explanation for the 
peaks in withdrawals (and in the rate of turnover) we must seek other 
reasons for the very close correspondence between monthly withdrawals 
and new deposits. Indeed, as Chart III shows, high monthly withdraw- 
als are usually associated with high levels of new deposits and low 
withdrawals with low deposits. 
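
The closeness of such a correspondence can be summarized by a coefficient of correlation between the two monthly series. The following sketch computes the ordinary product-moment coefficient; the monthly figures are hypothetical and merely illustrate two series that rise and fall together, and nothing here reproduces the data plotted in Chart III.

```python
import math


def pearson(xs, ys):
    """Product-moment correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly totals (any money unit), January through December.
withdrawals = [120, 90, 95, 110, 92, 94, 115, 91, 93, 108, 90, 96]
new_deposits = [125, 95, 98, 112, 96, 99, 118, 94, 97, 111, 95, 100]

r = pearson(withdrawals, new_deposits)
print(round(r, 3))
```

A coefficient near 1 indicates that months of heavy withdrawals are also months of heavy new deposits, which is the pattern described above.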

There are several possible explanations for such a close correlation. 
A transfer of funds from one savings bank to another is reported both 
as a withdrawal and a deposit. A more important explanation seems to
be, however, that a large proportion of account activity represents use 
of savings accounts for the purpose of cashing occasional checks; many 
people deposit checks to their savings accounts and withdraw part or 
the entire amount in cash.²⁷

Thus it is likely that, in addition to interest payments, the January 
peak in account activity reflects cashing of dividend, bonus, and Christ-
mas checks. In other quarterly months, similar deposits of dividend
and other quarterly checks are made and a large proportion of the pro- 
ceeds is withdrawn in cash. 

The conclusion seems justified that velocity rates for savings deposits


27 There are among savings accounts a number of small but relatively active accounts which are
apparently maintained primarily to cash checks rather than to build up a backlog of savings. An analy-
sis of a sample of widely scattered savings departments of commercial banks and mutual savings banks
revealed that in 1937-40 most accounts which averaged more than one withdrawal a month accounted
for about 5 per cent, and in only exceptional cases for as much as 10 or more per cent, of total de-
posit liability. These accounts were responsible, however, for from 10 to 50 per cent of
withdrawals in individual institutions.

A much smaller sample showed that in 1943-44 the proportion of active accounts declined, but that
these accounts were considerably more active than during the years immediately preceding World War
II. Likely, these changes reflect temporary wartime developments (see Savings Banks Trust Com-
pany, Chart Book, Chart A-26).

are not necessarily indicative of the rate at which balances are used for
transaction purposes. Part of the activity of savings deposit accounts 
reflects shifts of investments among various media, including transfers 
from one bank to another. Rates of savings deposit turnover reflect the 
use of balances, but also the use of the product of such balances (inter- 
est) and—perhaps even more—the utilization of check cashing facilities 
which these balances provide.²⁸

Inter-city differences in rates of time deposit turnover also reflect 
differences in the relative importance and activity of time deposits of 
business firms and of government units and in accounting practices. 
All this, together with the existence of a strong seasonal pattern, makes 
the interpretation of changes in the rates of time deposit turnover
rather difficult. 

The main conclusions of this paper may be summarized as follows: 

1. Savings deposits at commercial banks turn over about once every 
two years, or only about one-fiftieth as rapidly as demand deposits. 

2. Because time deposits (other than interbank and U. S. Govern- 
ment) at commercial banks include several types of time accounts in 
addition to savings deposits, their velocity is higher than that of savings 
deposits alone, but still less than one. (The relatively higher turnover 
of time deposits in New York City member banks compared with
“outside” banks reflects a higher percentage of time deposits other
than savings deposits.)

3. There has been a downward drift of the rate of time deposit turn- 
over in commercial banks outside New York City reflecting the relative 
increase of savings deposits in total time deposits of these institutions. 
The relative proportion of savings deposits in reported time deposits 
has been increasing continuously since the establishment of the Federal 
Reserve System, thus narrowing the difference between rates of turn-
over of commercial bank time deposits and savings deposits. Currently,
more than 94 per cent of time deposits at commercial banks outside
New York City are savings deposits. It is, therefore, likely that changes
in the velocity of savings bank accounts approximate closely those of
savings and also time deposits at commercial banks outside New York
City.

4. Deposits at savings banks and shares of savings and loan associa-
tions turn over about half as rapidly as savings deposits at commercial


28 A survey made of several savings institutions during the last war indicated that checks cashed or
... (a portion of the cash being deposited) were between ... and ... the number of regular deposit
transactions. (See Savings Banks Trust Company, Chart Book, Chart A-26.) In some cases, ...
deposited.

banks. This difference in level of velocity (changes being similar) is
attributable to institutional factors.

5. The rate of turnover of savings deposits is more stable than that
of demand deposits. This stability reflects partly the failure of savings
deposits to become more active when business expands.

6. The difference in the nature and use made of savings and demand
deposits is reflected in significant differences in the seasonal patterns of
their velocities.

7. A relatively small proportion of savings accounts with nominal
balances are used for cashing salary checks and for the temporary safe-
keeping of funds, causing them to influence disproportionately the ac-
tivity of total deposits of a savings institution.

If these conclusions based on fragmentary and heterogeneous data 
surveyed have general validity, it would appear that those who are re- 
luctant to include time deposits in the money supply can find in them 
substantial arguments for their position. Indeed, if distinction between 
money and other types of liquid assets is approached not as a logical but 
as an empirical problem, it is apparent that average turnover rates of 
between 20 and 50 times a year are significantly different from those of 
once every two or four years. A significant part of the “activity” of 
savings accounts does not reflect withdrawals for making payments, 
but arises from the closing of accounts, from the cashing of salary and 
other checks (which are ultimately debited against demand deposit ac-
counts at commercial banks thus affecting velocity data based on such
debits), and from the transfer of funds (including interest received) to 
other savings institutions. Analysis of the available statistical data on 
the rates of turnover of savings and time deposits shows, furthermore, 
that these rates do not exhibit the same cyclical conformity or the same
seasonal patterns as the rates of turnover of demand deposits. 
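
The orders of magnitude involved in this comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are not taken from the paper, and the rates are the round figures quoted in the text.

```python
# Round figures quoted in the text, in turnovers per year.
DEMAND_LOW, DEMAND_HIGH = 20.0, 50.0   # demand deposits
SAVINGS_COMMERCIAL = 1 / 2             # savings deposits at commercial banks: once every two years
SAVINGS_MUTUAL = 1 / 4                 # mutual savings banks: once every four years


def ratio_range(savings_rate):
    """Return (low, high): how many times faster demand deposits turn over."""
    return DEMAND_LOW / savings_rate, DEMAND_HIGH / savings_rate


commercial_ratio = ratio_range(SAVINGS_COMMERCIAL)
mutual_ratio = ratio_range(SAVINGS_MUTUAL)

# "One-fiftieth as rapidly" (conclusion 1) implies this demand deposit rate:
implied_demand_rate = 50 * SAVINGS_COMMERCIAL

print(commercial_ratio, mutual_ratio, implied_demand_rate)
```

On these figures demand deposits turn over from 40 to 100 times as fast as savings deposits at commercial banks, and the ratio of one to fifty in the first conclusion corresponds to a demand deposit rate of 25 times a year, which lies inside the range of 20 to 50.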

Considerably more justification could be found for including time 
deposits other than savings deposits with the money supply, although 
for recent years the amounts involved are very small in relation to de-
mand deposits.

THE increased quantity and quality of national income statistics

which have become available within the past few years make possi- 
ble a re-examination of significant relationships within the U. S. econ- 
omy. The present study represents an examination of one such relation- 
ship—labor and property shares in the national income. For this pur- 
pose the detailed accounts prepared by the National Income Division 
of the Department of Commerce have been utilized for the years 1929— 
1950. The conclusions for this period are then compared with the 
Kuznets data for the years 1919-1929. 

The present study is intended to serve two purposes: 

First, it will examine the common generalization that wages and 
salaries are a stable proportion of national income.¹ This generalization
turns out to be valid only in relation to a particular set of national in- 
come concepts. When other concepts are employed it appears that 
labor shares in national income are not stable, but shift significantly 
from year to year and over a period of years. Section I will be devoted 
to an examination of alternative national income concepts which will 
differ substantially from those employed by the Department of Com- 
merce. Data on relative shares (changes in the distribution of income by 
type) will be presented and examined in Section II and in Section III. 

Second, this study will investigate changes in labor and property 
shares in the last three decades, but with particular reference to the
decade of the Forties. That is, attention will be centered on the be-
havior of labor and property shares under inflationary conditions. The
data show a general pattern: increases in economic activity are associ-
ated with a reduction in labor shares of total income; decreases in eco-
nomic activity are associated with an increase in labor shares of total
income; inflationary conditions appear to accentuate the decline in
labor shares which is associated with increases in activity. The rationale
for this behavior pattern will be considered in Section IV.²


* The author is indebted to Joseph A. Pechman of the U. S. Treasury Department for criticisms of
an earlier draft of this article. 

1 See, for example, Paul A. Samuelson, Economics (McGraw-Hill Book Co., 1951), p. 231. 

2 In the literature, the problem of inflation has generally been treated in aggregative terms, as an 
“excess of demand over supply,” in terms of the rate of inflation and its measurement as a departure 
from assumed stability conditions. See, for example, Arthur Smithies, “The behavior of money national 
income under inflationary conditions,” Quarterly Journal of Economics, November 1942, 113-138; 
Tjalling Koopmans, “The dynamics of inflation,” Review of Economics and Statistics, May 1942, 53-65. 
See also the literature on the inflationary gap, such as Walter A. Salant, “The inflationary gap,” American 
Economic Review, June 1942, 308-313; Milton Friedman, “Discussion of the inflationary gap,” 
ibid., 314-20. 
Before examining the data in detail it will be necessary to set forth 
the conceptual basis for the measurement of labor and property shares 
in total income. 

I 

National income concepts have no independent validity; they must 
be appropriate to the purposes at hand.3 The present purpose is the 
measurement of the relative magnitude of labor income and of property 
income over a period of time. Two basic concepts (and some variants 
thereof) will be employed. The first basic concept will be labeled the 
“primary” distribution of income; the second, the “claims” concept of 
income. 

The primary distribution of income will measure labor and property 
shares derived from current economic activity. Therefore, income 
equals earnings paid or accrued for the use of property or personal 
services. This concept is similar to “earning power” measurements of 
income4 and to the concept of producers’ income.5 

The second basic concept, income claims, attempts to measure 
labor and property shares in terms of relative economic power. Claims 
on goods and services reflect not only earnings and accruals from 
current output but also earnings from the purchase and sale of assets (net 
capital gains) and income receipts which are not derived from 
current production. The claims concept of income is, therefore, equal 
to the primary distribution of income plus net capital gains plus 
transfer payments. This brings the concept closer to Haig's definition of 
income as “the money value of the net accretion to one's economic power 
between two points of time.”6 
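
The relation between the two basic concepts can be written out as a short calculation. What follows is only an illustrative sketch: the function name and the dollar figures are hypothetical and are not taken from the Department of Commerce accounts.

```python
def claims_income(primary_income, net_capital_gains, transfer_payments):
    """Claims concept = primary distribution + net capital gains + transfer payments.

    net_capital_gains is a net figure and may be negative when losses exceed gains.
    """
    return primary_income + net_capital_gains + transfer_payments


# Hypothetical figures, in millions of dollars, for illustration only.
year_with_net_gains = claims_income(200_000, 5_000, 10_000)
year_with_net_losses = claims_income(50_000, -2_000, 1_500)
print(year_with_net_gains, year_with_net_losses)
```

Because capital gains enter at net, a year of aggregate losses lowers income on the claims basis relative to what the other two components alone would give.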

Much less attention has been paid to structural problems, that is, to the changes within the economy 
during inflationary periods. In one of the few contributions to this subject Holzman points out that only 
two of the ten economists represented in the symposium on inflation in the August, 1949 Review of 
Economics and Statistics mentioned distributional aspects of inflation. (See Franklyn D. Holzman, “Income 
determination in open inflation,” Review of Economics and Statistics, May 1950, 150-58.) Another 
contribution to structural relationships is Ralph Turvey, “Period analysis and inflation,” Economica, 
August 1949, 218-27. 

3 Simon Kuznets, “National Income,” Encyclopedia of the Social Sciences (New York, 1934), Vol. 
XI, p. 206. 

4 For a discussion of the “earning power” concept of income see Concentration and Composition of 
Individual Incomes, 1918-1937, TNEC Monograph No. 4, Washington, 1940 (by Adolph J. Goldenthal), 
pp. 9-11. 

5 Tibor Barna, Redistribution of Incomes Through Public Finance in 1937 (Oxford, 1945), p. 17. 

6 R. M. Haig, The Federal Income Tax (Columbia University Press, 1921), p. 7. (Cited by Roy 
Blough and W. W. Hewett, “Capital gains in income theory and taxation policy,” Studies in Income 
and Wealth, Vol. II (National Bureau of Economic Research, 1938), p. 197.) Haig would define “net 
accretion” to include consumer expenditures during the period. A complete application of the “economic 
power” concept of income would require the inclusion of both realized and unrealized net capital gains, 
but no data are available on the latter. Also, it would be necessary to adjust income receipts and accruals 
in accordance with personal taxes. No attempt has been made to deal with this adjustment in the data 
presented here. However, an effort along these lines, based on Department of Commerce income 
concepts, has recently been made. See Edward F. Denison, “Distribution of national income,” Survey 
of Current Business, June 1952, 22-8. 
The remainder of this section will be devoted to examination of the 
conceptual basis of the primary distribution of income, and the 
behavior of labor and property shares when measured on this basis. The 
claims concept of income will be examined in Section III. 


PRIMARY DISTRIBUTION OF INCOME 


The measurement of labor and property shares in accordance with 
the primary distribution of income is set forth in detail for the years 
1929-1950 in Appendix Table A. Three concepts of total income have 
been employed. Labor income is identical in each of these three. 
Concepts II and III differ from Concept I in their treatment of property 
income. 

Labor income (Concepts I, II, III) includes: 


1. The compensation of employees 


This embraces wages and salaries in the private, military and 
government civilian sectors, and the supplements to wages and salaries, 
including employer contributions to social insurance and to private 
pension funds. The compensation of corporate officers is treated here as 
labor income, although a part of such compensation should probably 
be classified as property income. That is, corporation managers may be 
in a position to “divert” corporate net income from stockholders to 
managers. The amount of such diversion is likely to be small in relation 
to the total of property income and will probably exert no perceptible 
influence on relative shares.7 


2. Income of unincorporated enterprise less inventory valuation 
adjustment 


It is exceedingly difficult to determine whether this type of income is 
a labor or a property share. Entrepreneurial income is a mixture of 
payment for personal services (labor income) and return from property. 
In some instances, as with the income of professionals, it may be 
predominantly attributable to labor. In other instances, as in 
unincorporated manufacturing enterprise, it may be predominantly a return to 
capital. Not only will the division of this type of income between labor 
and property vary from industry to industry, it will also vary within an 


7 For example, in 1939 the compensation of corporate officers was about 9 per cent of the total 
of wages and salaries in the corporate sector. In 1948 it was about 7.5 per cent (U. S. Department of 
Commerce, National Income, 1951, p. 156). On the other hand, there has undoubtedly been a growth 
in pension plans, stock purchase plans and expense accounts available to executives. It is not possible 
to measure the volume of this kind of income or determine whether it represents property income or 
labor income. 
industry. Furthermore, there is no reason to assume that the 
labor-property relationship within the total of income from unincorporated 
enterprise will remain constant from year to year. 

The problem of classification of income of unincorporated enterprise 
becomes, then, the logically impossible feat of separating the insepara- 
ble parts of a whole. However, the sector is too large to be ignored; in 
1950 it accounted for 15 per cent of the total of national income. Chart 
I shows the movements in the totals of property income, labor income 
and income from unincorporated enterprise for the years 1929—1950. 
No clear pattern of relationship is established by the fluctuations in the 
three series. During the depression and recovery years the income of 
unincorporated enterprise generally moved in the same direction and 
to the same degree as property income. But in the war years its move- 
ments were comparable to those of employee compensation. 


CHART I. Compensation of Employees, Property Income, Income of 
Unincorporated Enterprise, 1929-1950. (Vertical axis: billions of dollars; 
horizontal axis: year. Source: data from Department of Commerce, 
National Income, 1951. See text for definitions.) 


The industrial composition of income of unincorporated enterprise 
may provide a somewhat better basis for its classification as labor in- 
come or property income than its behavior pattern. Although the com- 
position of income of unincorporated enterprise varies from year to 
year, four categories generally comprise its greatest bulk: agriculture, 
retail trade, medical and other health services, and legal services. (In 
1950 these four made up 72 per cent of the income of the sector.) Since 
these four categories, in turn, appear to be predominantly labor-type 
activities, it would appear most reasonable to classify the whole sector 
as labor income. This would appear to be a better procedure than 
attempting to estimate its changing labor and property segments year 
by year.8 

The classification of income from unincorporated enterprise as labor 
income would therefore appear to be justified on the ground of its in- 
dustrial composition. The advantage of this classification is that it ob- 
literates the shifts between the self-employed and wage earners; that is, 
it eliminates changes in labor shares that would otherwise result from 
changes in the proportion of the labor force which is self-employed. The 
inescapable disadvantage is that labor income, defined to include the 
income of unincorporated enterprise, will contain an uncertain and 
variable component of property income. 

Property income, Concept I, includes: 


1. Rental income of persons 


Although this type of income reflects some element of personal serv- 
ices provided by landlords to tenants, no breakdown of this portion is 
possible and the receipt as a whole is predominantly property income. 


2. Corporate profits after tax, without allowance for inventory valuation 
adjustment. 


This measurement of property income in the corporate sector raises 
two questions: 

a. In the consideration of income shares derived from current 
production it is appropriate to measure corporate profits after tax rather 
than corporate profits before tax. Although corporate profits before tax 
are, in some sense, available for the payment of wage increases, once 
the wage bill is paid, corporate profits taxes are not available to either 
labor or property income recipients; the taxes may not be spent by the 
corporation or its stockholders. The measurement of corporate profits 
net of taxes has the further advantage that it puts all property income 
on a uniform basis with respect to taxation; that is, all property income 
is then measured net of business taxes but gross of personal taxes.9 


8 Kuznets seems to be inclined toward the view that entrepreneurial income is predominantly labor 
income. See Simon Kuznets, National Income and Its Composition, 1919-1938, Vol. I (National Bureau 
of Economic Research, 1941), p. 82. 

9 The controversy concerning whether corporate profits taxes are reflected in product prices is not 
germane to the approach here, where income is measured in terms of shares paid or accrued. 
However, for illustrative purposes, a measurement of property income 
which includes corporate profits taxes is set forth in Concept II. The 
differences in behavior exhibited by the two concepts will be examined 
below. 

b. Inventory valuation adjustment is appropriately excluded from 
the measurement of property shares. The Department of Commerce 
adjustment is for the purpose of arriving at the value of goods and 
services produced within the accounting period. But profits or losses 
on inventory should be included in a measurement of payments or 
accruals to property income recipients from current activity. Such profits 
and losses are as much property income, and as much dependent on 
current operations, as profits from the production and sale of goods and 
services within the accounting period. 


3. Net interest 
This receipt is clearly and unambiguously property income. 


4. Net government interest 


Although excluded from national income, and treated, in effect, as a 
transfer payment in the Department of Commerce national income 
accounts, government interest is included here in the total of property 
income. Viewed from the standpoint of the recipient of government 
interest, this type of receipt is as much property income as interest on 
industrial bonds. The recipient of interest is the owner of property; 
this property yields a return in the form of government interest 
payments. The inclusion of government interest is, therefore, appropriate 
when income flows are viewed from the standpoint of the recipient. It 
is equally appropriate to exclude government interest when, as in the 
Department of Commerce concepts, income flows are viewed in relation 
to the aggregate of current output. 

The sum of items 1-4 represents property income, Concept I. Table I 
shows a reconciliation between the concept of national income em- 
ployed by the Department of Commerce and the income concepts em- 
ployed here for the measurement of the relative shares of labor and 
property income paid or accrued to labor and property recipients. 

Two additional concepts of property income have been developed in 


10 In accordance with Department of Commerce procedure, government interest is measured ex- 
clusive of intra-governmental transfer, with the assumption that this amount may be appropriately 
imputed to individuals; that is, government interest receipts by financial institutions correspond to 
imputed service charges. (See National Income, 1951, p. 93.) 
examining relative shares in accordance with the primary distribution 
of income. In Concept II corporate profits taxes have been added. This 
concept is developed primarily for purposes of comparison, and in 
accordance with the convention that corporate profits taxes should be 
reflected in the national income. 

In Concept III property income reflects the addition of capital con- 
sumption allowances (corporate profits taxes are excluded). This treat- 
ment has the advantage of establishing a parity between labor and 


TABLE I 

RECONCILIATION: NATIONAL INCOME (DEPARTMENT OF 
COMMERCE), THE PRIMARY DISTRIBUTION OF INCOME, 
THE CLAIMS CONCEPT OF INCOME, 1947 

                                                      (Millions of dollars) 
National Income, Department of Commerce                      198,688 
Deduct: 
    Inventory Valuation Adjustment, Corporate                 -5,757 
    Inventory Valuation Adjustment, Unincorp.                    ... 
    Corporate Profits Taxes                                      ... 
Add: 
    Government Interest                                        4,378 
The Primary Distribution of Income: 
    Concept I                                                    ... 



property shares, since labor income, of necessity, includes “labor 
consumption allowances,” that is, the cost of maintaining labor's earning 
capacity. Capital consumption allowances, as estimated by the 
Department of Commerce, represent amounts which are generally available to 
the property sector for purposes of maintaining capital equipment. 
Whether or not these allowances are adequate to maintain the capital 
“intact” is irrelevant for present purposes. The important point is that 
capital consumption allowances are available for expenditure by the 
property sector. 
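
The three concepts of the primary distribution differ only in what is counted as property income, and the effect on labor's share can be illustrated with a short calculation. The sketch below is hypothetical throughout: the function names and all dollar figures are invented for illustration, and labor income is assumed to include the income of unincorporated enterprise, as in the text.

```python
def property_income(rental, profits_after_tax, net_interest, government_interest,
                    profits_taxes=0.0, capital_consumption=0.0, concept="I"):
    """Property income under Concepts I, II and III of the primary distribution."""
    # Concept I: the four items listed in the text.
    base = rental + profits_after_tax + net_interest + government_interest
    if concept == "I":
        return base
    if concept == "II":   # Concept I plus corporate profits taxes
        return base + profits_taxes
    if concept == "III":  # Concept I plus capital consumption allowances
        return base + capital_consumption
    raise ValueError(f"unknown concept: {concept!r}")


def labor_share(labor_income, prop_income):
    """Labor income as a percentage of total (labor plus property) income."""
    return 100.0 * labor_income / (labor_income + prop_income)


# Hypothetical figures in billions of dollars; labor income is the same in all concepts.
labor = 150.0
components = dict(rental=5.0, profits_after_tax=15.0, net_interest=4.0,
                  government_interest=4.0, profits_taxes=10.0,
                  capital_consumption=12.0)
shares = {c: labor_share(labor, property_income(concept=c, **components))
          for c in ("I", "II", "III")}
for concept, share in shares.items():
    print(f"Concept {concept}: labor share = {share:.1f} per cent")
```

With these invented figures labor's share is highest under Concept I, since Concepts II and III each add an item to property income while leaving labor income unchanged.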
II 


The behavior of labor and property shares in total income, where 
total income is based on the primary distribution of income, is depicted 
in Chart II. The underlying data are shown in Appendix Table A. The 
three concepts which have been discussed are plotted in terms of rela- 
tives, that is, as labor's share of total income for each year 1929-1950. 


CHART II. Labor Shares in Total Income: the Primary Distribution 
of Income, 1929-1950. 
FLUCTUATIONS IN THE PRIMARY DISTRIBUTION OF INCOME 


Labor shares, measured in accordance with Concept I, rise sharply 
as economic activity declines from 1929 to 1932. With the progress of 
recovery after 1933, labor shares decline at the outset and then remain 
reasonably stable for the remainder of the decade at from 80 to 83 per 
cent of total income. This is a substantially higher share than in 1929, 
when labor shares were 75 per cent of total income. Since labor-property 
shares in 1929 were comparable to labor-property shares in most 
of the years of the Twenties, it follows that labor’s share of total 
income in the late Thirties was substantially higher than in the Twenties. 

The war period exhibits a striking increase in the share of total 
income going to labor, in marked contrast with the generally stable 
lationship from 1933 to 1941. The war economy distributed a substan- 
tially higher proportion of total income to labor than in the late Thirties. 
In 1945 labor's share of total income was 88 per cent. This was 7 per- 
centage points more than in 1941 and 13 percentage points more than 
in 1929. 

The postwar period, marked by a decline in real output and an 
increase in prices, is associated with a decline in labor's share of total 
income. The downward movement carries labor shares almost but not 
quite back to the levels of 1933-1941. Interestingly enough, the 1949 
recession is marked by a slight increase in labor shares, consistent with 
the slight increase in the recession of 1938. Also there is a slight 
downturn in labor shares in 1950, notable not so much for the magnitude of 
change, which is insignificant, but for the direction of change. From 
1941 to 1942 the mobilization program produced a sharp increase in 
labor's share of total income. This seems not to have happened in the 
1950 mobilization. 

The behavior of labor shares in total income, where income includes 
corporate profits taxes, is shown in Concept II, Chart II. The year to 
year movements are comparable to those of Concept I, where corporate 
profits taxes are excluded, except for the years 1939 to 1941 when 
corporate profits taxes rose sharply. However, the inter-decade position of 
labor income is different when Concept II is employed. In World War 
II labor shares of total income (including corporate profits taxes) now 
appear to be no higher than in the late Thirties, and, in the postwar 
period, substantially lower than in the late Thirties. As has been argued 
above, it would not be appropriate to attach too much importance to 
conclusions based on a concept of income which includes corporate 
profits taxes, since these taxes should not properly be viewed as income 
paid or accrued from current activity.11 


11 A recent study of wage shares in the national income of the United Kingdom shows a very 
different behavior pattern than for the United States. The war years, for example, show low rather than 
high ratios of wage income to the total. (See E. H. Phelps Brown and P. E. Hart, “The Share of Wages 
in National Income,” Economic Journal, June 1952, 253-77, especially p. 265.) However, the income 
concepts employed by Messrs. Brown and Hart are not wholly comparable with those utilized here. 
Profits, for example, are measured gross of profits taxes. 
In Concept III, Chart II, capital consumption allowances are 
included in property income, and a sharply different behavior pattern 
emerges than when capital consumption allowances are neglected. 
Property income shares are not only considerably larger but now show 
much greater stability through the years of most serious depression. 
In 1932 capital consumption allowances are larger than all other 
property income. 

The inclusion of capital consumption allowances further accentuates 
the relative improvement in labor's share during World War II. The 
shrinkage in labor shares in the postwar period is similar in magnitude 
to that exhibited by Concept I. However, Concept III shows that 
labor's share remains higher in the postwar period than in the Thirties. 
This is to be attributed to the fact that capital consumption allowances 
are smaller in relation to other income in the postwar period than dur- 
ing the Thirties. 

Perhaps the most important conclusion which emerges from the data 
shown in Chart II is that, employing either Concept I or Concept III, 
labor shares in total income rose sharply in World War II, receded 
during the postwar period, but remained somewhat higher than in the 
late Thirties. The economy appears to have been distributing a larger 
portion of its total income to labor over the past twenty years, although 
the postwar inflation has offset a part of this tendency. 


PRIMARY DISTRIBUTION OF INCOME IN THE CORPORATE SECTOR 


Department of Commerce data permit isolation and separate 
examination of labor and property shares in the corporate sector. The 
behavior of labor shares in accordance with varying concepts of income is 
set forth in Chart III; the data are shown in Appendix Table B. In 
general, Concepts I, II, and III are similar in definition to those utilized 
in Chart II. 

Labor income, for all three concepts, is uniformly taken to include the 
compensation of employees of corporate enterprise. This compensation 
includes wages and salaries, employer contribution for social insurance 
and for private pension funds, and “other labor income.” Property 
income in Concept I includes corporate profits after tax and net interest 
paid by corporations. For Concept II, property income also includes 
corporate profits taxes. For Concept III property income excludes 
corporate profits taxes, but includes corporate depreciation, for which 
data are available since 1933.12 


12 Corporate depreciation has been used because it is not possible to isolate capital consumption 
allowances for corporations in the Department of Commerce accounts. 
As Chart III shows, fluctuations in labor shares in the corporate 
sector are considerably sharper than the fluctuations in labor shares in 
the economy as a whole. For example, the 1937-38 recession and the 
1949 recession are considerably more pronounced. And, of course, the 
differences in behavior between Concept I and Concept II (corporate 
profits taxes included) are considerably sharper than for the economy 

CHART III. Labor Shares in Total Income, Corporate Sector, the 
Primary Distribution of Income, 1929-1950. (Vertical axis: per cent of 
total income. Source: data from Department of Commerce, National 
Income, 1951.) 


as a whole. The inclusion of corporate depreciation (Concept III) seems 
to have an effect similar to the inclusion of consumption allowances for 
the economy as a whole. 

The conclusions from Chart III are similar to those which may be 
derived from Chart II. Labor shares in the corporate sector were higher 
in the late Thirties than in 1929, and higher in World War II than in 
the late Thirties. 

Postwar changes are more difficult to interpret. Where income is 
based on Concept I it would appear that the shrinkage in labor shares 

It would be appropriate to include net rents in property income for the corporate sector: rents and 
royalties received by corporations less net rent paid on business property. Unfortunately, complete 
data on this type of income are not available for all years. However, in dollar amounts, and in its effect 
on relative shares, this omission is unimportant. For example, in 1942, net rents of corporations amounted 
to -$8 million. (U. S. Treasury Department, Statistics of Income for 1942, Part 2, pp. 328-29.) 
was sufficiently pronounced to bring this sector back to about the same 
relative position that it occupied in the semi-prosperous years of the 
Thirties. However, where corporate depreciation is included (Concept 
III) the conclusion is comparable with that observed for the economy 
as a whole. The shrinkage in labor's relative position in the postwar 
years still left this sector in a slightly improved status as compared 
with the Thirties. 


III 


The measurements of labor and property shares in total income, as 
set forth in Part II, are based on the primary distribution of income, 
defined as payments or accruals from current economic activity. It is 
also useful to explore the behavior of relative shares when income is 
measured on the basis of claims. This requires attention to the 
appropriate classification of net capital gains and of transfer payments. 


RELATIVE SHARES ON THE BASIS OF CLAIMS 


Capital gains are rather obviously property income and have been 
classified as such here. Data on realized capital gains are taken from 
the Seltzer study.13 Capital gains have been included in property 
income at net; that is, losses have been deducted from gains, and where, 
as in depression years, there are aggregate net losses, these are added 
algebraically to property income. Net gains realized by both individuals 
and corporations are included. 

Transfer payments are classified as labor income. Again, it should 
be recalled that income is viewed here from the standpoint of the 
recipients. There can be little doubt that the recipients of labor income 
are, by and large, the recipients of transfer payments. In 1950, for 
example, $6,125 million of the $15,082 million of transfer payments 
represent benefits from social insurance funds. Military and veterans' 
pensions and benefits, generally related to services previously 
performed, amounted to an additional $4,284 million. 
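
The weight of these two categories in total transfer payments follows directly from the 1950 figures just quoted. The short calculation below uses only those three figures; the variable names are introduced here for illustration.

```python
# 1950 transfer payments, in millions of dollars, as quoted in the text.
total_transfers = 15_082
social_insurance_benefits = 6_125
military_and_veterans = 4_284

social_insurance_pct = 100 * social_insurance_benefits / total_transfers
military_veterans_pct = 100 * military_and_veterans / total_transfers
combined_pct = social_insurance_pct + military_veterans_pct

print(f"Social insurance benefits: {social_insurance_pct:.1f} per cent")
print(f"Military and veterans:     {military_veterans_pct:.1f} per cent")
print(f"Combined:                  {combined_pct:.1f} per cent")
```

On these figures social insurance benefits are about two-fifths of transfer payments, and the two categories together are a little over two-thirds.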

In combining transfer payments with labor income as defined in 
accordance with the primary distribution of income (Part II), it is 
necessary to deduct employer and employee contributions for social 
insurance and employer contributions for private pension funds from the 
total of labor income. This will avoid counting, as labor income, both 
employer and employee contributions to social insurance and pension 
funds and employee benefits from social insurance and pension funds. 


13 Lawrence H. Seltzer, The Nature and Tax Treatment of Capital Gains and Losses (National Bureau 
of Economic Research, 1951), pp. 367, 531. 
This deduction has been made in Chart IV, and in Appendix Table C, 
which sets forth the data on which Chart IV is based.14 
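
The adjustment from the primary distribution to the claims basis can be set out as a small calculation. This is a sketch under stated assumptions rather than the procedure actually used for Appendix Table C: the function names and every dollar figure are hypothetical.

```python
def claims_labor_income(primary_labor_income, social_insurance_contributions,
                        private_pension_contributions, transfer_payments):
    """Labor income on a claims basis.

    Contributions are deducted before transfer payments are added, so that
    contributions and the benefits they finance are not both counted.
    """
    return (primary_labor_income
            - social_insurance_contributions   # employer and employee
            - private_pension_contributions    # employer only
            + transfer_payments)


def claims_property_income(primary_property_income, net_capital_gains):
    """Property income on a claims basis; net gains may be negative."""
    return primary_property_income + net_capital_gains


# Hypothetical figures in millions of dollars, for illustration only.
labor = claims_labor_income(180_000, 6_000, 2_000, 15_000)
prop = claims_property_income(30_000, 3_000)
labor_share_pct = 100 * labor / (labor + prop)
print(labor, prop, round(labor_share_pct, 1))
```

Whether the claims basis raises or lowers labor's share relative to the primary distribution depends on the size of net transfers to labor relative to net capital gains in the year in question.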

Chart IV shows labor’s share in total income in accordance with the 
claims concept of income; transfer payments are included in labor in- 
come and net capital gains in property income. Concept III differs 
from Concept I in that capital consumption allowances are included in 
the property income segment. 

CHART IV. Labor Shares in Total Income: the Claims Concept of 
Income, 1929-1950. 
A comparison of Concept I, Chart II, with Concept I, Chart IV 
shows that the measurement of income on a claims basis produces 
results which are markedly different than the measurement of income on 
the basis of primary distribution. First, property income falls more 
sharply in the depression because of the impact of capital losses. 
Second, the wartime improvement in labor's relative position is still 
substantial, when compared with the Thirties, but is by no means as 
striking as when net capital gains and transfer payments are omitted. 
Third, labor's relative position in the postwar period appears to be 
almost identical with its relative position in the late Thirties. This is a 
reflection of the fact that transfer payments are less significant in rais- 


14 Employee benefits from social insurance funds are included in transfer payments, but benefits 
from private pension funds are not included, and cannot be isolated in the national income accounts. 
The procedure which has been adopted here will, therefore, tend to understate labor income to the 
extent of current benefits from private pension funds. 
ing the labor share of total income in the postwar period than are 
capital gains in raising the property share. 

When capital consumption allowances are included (Concept III), it 
will be observed that the effect of the net gains and transfer payments is 
somewhat diluted. The behavior of Concept III in Chart IV is very 
similar to the behavior of Concept III in Chart II. The depression 
shrinkage in property incomes occasioned by capital losses is, of course, 
reflected in Chart IV, but the wartime improvement in labor’s relative 
share appears to be less striking when net gains and transfer payments 
are included. For both the primary distribution of income and the 
claims concept of income, the inclusion of capital consumption allow- 
ances in property income tends to accentuate the improvement in 
labor’s relative position over the whole period. 

In conclusion, it would appear that the major differences between 
the measurement of labor and property shares in terms of the primary 
distribution of income and in terms of the claims concept of income 
are: 1) the claims concept of income shows labor’s relative improve- 
ment to be more modest during World War II than does the primary 
distribution of income, and, 2) the claims concept shows labor's postwar 
relative position to be almost identical with its relative position in the 
late Thirties, while the primary distribution shows labor’s relative 
position to be slightly improved in the postwar period. 


LABOR AND PROPERTY SHARES IN THE TWENTIES 


Data for the decade of the Twenties, comparable in refinement to the 
Department of Commerce accounts for 1929-50, are not available. 
However, the Kuznets estimates provide an adequate basis for broad 
comparison. 

Chart V, for the years 1919-1929, is based on Kuznets data, with 
concepts generally similar to those described above in terms of the 
primary distribution of income. As the chart indicates, from 1921 to 
1929 there was remarkable stability in the labor-property division of 
total income. 

Data are not available for the estimation of labor and property 
shares in the Twenties in accordance with the claims concept of in- 
come; the necessary information on transfer payments is lacking. How- 


For a detailed discussion of these concepts see Simon Kuznets, National Income and Its 
Composition, 1919-1938, Vol. I (National Bureau of Economic Research, 1941), pp. 3-60, 215-65. 

The concept of “total income” as used in Appendix Table D and Chart V differs from Kuznets’ 
national income by the exclusion of government net savings. On the other hand, “total income” as used 
here differs from the primary distribution of income, as set forth in Chart I, principally in the exclusion 
of imputed rent on owned homes and food produced and consumed on farms. 
ever, the Seltzer data on capital gains for individuals for these years 
would indicate that the stability exhibited in the labor-property 
relationship would be substantially modified by the addition of net gains 
to the property segment.15 For example, in the years 1919-1923 net 
capital gains for individuals aggregate to an insignificant $30 million. 
But for the years 1924-1929 they aggregate to $15.8 billion. Total 
property income for these years amounted to $96.9 billion. For the 
single year 1928, in which net gains were most significant, their inclu- 


CHART V. Labor Shares in Total Income, 1919-1929.
Source: Simon Kuznets, National Income and Its Composition, 1919-1938, Vol. I, National Bureau of Economic Research, New York, 1941.
sion in property income would raise the relative share of property income
in the total by about 20 per cent (four percentage points).
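
The orders of magnitude cited in this paragraph can be checked with a short calculation. The sketch below uses only figures quoted in the text, together with the 1928 property share of 20.8 per cent shown in Appendix Table D (line 14). The variable names are illustrative, and the four-point rise is taken from the text rather than recomputed from the Seltzer data.

```python
# Figures quoted in the text, in billions of dollars.
net_gains_1924_29 = 15.8        # net capital gains of individuals, 1924-1929
property_income_1924_29 = 96.9  # total property income, same years

# Net capital gains relative to property income over 1924-1929.
gains_ratio = net_gains_1924_29 / property_income_1924_29

# 1928: property share before capital gains (Appendix Table D, line 14)
# and the rise of four percentage points stated in the text.
property_share_1928 = 20.8
rise_in_points = 4.0
relative_rise = rise_in_points / property_share_1928

print(f"net gains / property income, 1924-29: {gains_ratio:.1%}")
print(f"relative rise in the 1928 property share: {relative_rise:.1%}")
```

On these figures net gains come to roughly one-sixth of property income for 1924-1929, and a four-point rise on a base of 20.8 per cent is a relative increase of a little under one-fifth, which agrees with the "about 20 per cent" of the text.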

If it may be assumed that transfer payments during the years 1919- 
1929 were minimal, it is evident that a claims concept of income for the 
Twenties, in contrast with a primary distribution concept, would show 
a downward drift in labor shares beginning in the year 1924, and con- 
tinuing through 1929. 


IV 


Before summarizing the general findings which emerge from this in- 
vestigation it may be useful to explore briefly the relationship between 


15 See Seltzer, op. cit., p. 367. Net gains of corporations are not available. 




changes in labor and property shares of total income and resulting, or 
corresponding, changes in the distribution of personal income. 


RELATIVE SHARES AND THE DISTRIBUTION OF INCOME


It is not possible to move directly from the measurement of distribu- 
tion of income by type to the distribution of income by size class be- 
cause many persons receive income from several sources. Nevertheless, 
the total income of upper income groups is composed more largely of 
property income than is the total income of lower income groups, and 
this fact establishes a rough linkage between changes in labor and property
shares and changes in concentration in the distribution of income.
On this basis it could be expected that an increase in property shares
in total income would tend to increase concentration in the distribution
of income. It could be expected that an increase in labor shares of total
income would tend to reduce concentration in the distribution of income.

However, as Pechman has pointed out, a decrease in the ratio of 
wages to total income can be produced in two ways: first, since wages 
are a smaller proportion of income at higher income levels, the ratio of 
wages to total income might decrease as income recipients are shifted 
upward in the distribution; second, an increased concentration of income
might reduce the share of total income going to wages.17 Therefore,
only when both total income and labor's share are concurrently in- 
creasing can it be certain that concentration is reduced; similarly, only 
when both total income and labor's share are falling can it be certain 
that concentration is increased. 
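
The rule just stated can be written out as a small decision function. This is only a sketch of the logic in the paragraph above; the function name, its arguments, and the wording of the returned labels are illustrative and do not come from Pechman or from the data.

```python
def concentration_signal(total_income_change: float,
                         labor_share_change: float) -> str:
    """Apply the rule in the text to one period.

    Both arguments are signed changes: positive for a rise, negative for a fall.
    """
    if total_income_change > 0 and labor_share_change > 0:
        # Income and labor's share rise together: concentration is reduced.
        return "concentration reduced"
    if total_income_change < 0 and labor_share_change < 0:
        # Income and labor's share fall together: concentration is increased.
        return "concentration increased"
    # The two move in opposite directions (or one is unchanged):
    # the shift in shares alone does not settle the question.
    return "indeterminate"


print(concentration_signal(+1.0, +1.0))
print(concentration_signal(+1.0, -1.0))
```

Only the first two branches give a definite answer; every mixed case falls through to "indeterminate", which is why so few periods qualify as certain instances.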

It follows that, strictly speaking, only the World War II case shows 
an irrefutable instance in which the concentration of income is reduced. 
In all other instances in the years 1919-1950 where labor shares rise or
fall, total income is moving in the opposite direction.

Nevertheless, the more generalized and less rigid hypothesis that an 
increase in labor shares is prima facie indication of reduced concentration
seems to be generally borne out by the conclusions that have been
reached on the basis of other methods of analysis. On the basis of this
hypothesis it would be concluded that, for both the primary distribution
of income and the claims concept of income, the distribution of
income was less concentrated in the Thirties than in the Twenties and 
less concentrated during World War II than in the Thirties. Concen- 
tration increased in the 1946-1950 period as compared with World War 


17 Frank A. Hanna, Joseph A. Pechman, Sidney M. Lerner, Analysis of Wisconsin Income (National
Bureau of Economic Research, 1948), pp. 69-75.
II, but the resulting degree of concentration was slightly less than in
the late Thirties. 

The Kuznets study of upper income group shares indicates that con- 
centration decreased from 1939 to 1945 and increased somewhat after 
1945.18 The finding here that income distribution was less concentrated 
in World War II than during the middle Thirties is also in line with
conclusions drawn from the surveys of family income for the years
1935-36, 1941, 1942 and 1945.19

On the other hand, the conclusion indicated in Chart II is that the 
1946-1948 inflation tended to increase property shares and, therefore,
in all probability, tended to increase concentration in the personal dis- 
tribution of income. The survey of consumer finances conducted by the 
Board of Governors of the Federal Reserve System indicates that con- 
centration increased from 1946 to 1947, but was reduced in the three 
succeeding years.20 This is in conflict with the Chart II data, which
show concentration increasing continuously from 1946 through 1948,
with a very modest reduction in concentration in 1949, and an increase
again in 1950. There appears to be no way to reconcile these conflicting 
conclusions. 

The finding in Chart II that labor shares tend to move against the 
business cycle, that is, increase during declines in economic activity and
decrease as economic activity increases, would tend to indicate that the 
primary distribution of income is more concentrated during expansions 
than during contractions. The Kuznets data on concentration as re- 
vealed by changes in the shares of the upper one per cent of income 
recipients (where property income dominates) would appear to support 
this conclusion. On the other hand, Mendershausen found that in 1933 
income was somewhat more concentrated than in 1929.22 However,
these findings were based on examination of the top 30 per cent of income
recipients. Mendershausen found that the top one per cent, where
property income is wholly predominant, showed a reduction in concen- 
tration between 1929 and 1933. 

It may be tentatively concluded that an increase in labor shares is


18 Simon Kuznets, Shares of Upper Income Groups in Income and Savings (National Bureau of Economic
Research, 1950), pp. 3-4.

19 Morris A. Copeland, “The social and economic determinants of the distribution of income in the
United States,” American Economic Review, March 1947, 57. Also, for a comparison of 1929 and 1937
see Julius Wyler, “The share of capital in national income,” Social Research, November 1943, 436-54.

20 Federal Reserve Bulletin, August 1951, p. 929.

21 Shares of Upper Income Groups in Income and Savings, p. 83 (Panel A).

22 Horst Mendershausen, Changes in Income Distribution During the Great Depression (National
Bureau of Economic Research, 1946), pp. 68-80.
generally associated with an increase in equality in the distribution of
income. However, the evidence does not suggest that this is an exact
relationship which holds for moderate year to year changes in labor and
property shares.23


THE GENERALIZED BEHAVIOR OF LABOR AND PROPERTY SHARES 


The relationships shown in Charts II, III and IV reveal a general
pattern of behavior. As the level of economic activity increases, labor
shares tend to decline. As the level of economic activity decreases,
labor shares tend to increase. The only marked exception to this pattern
is the case of World War II, when, as economic activity increased,
labor shares also increased.

It would appear that there are a number of factors which contribute 
to this inverse relationship between labor shares and changes in eco- 
nomic activity. 

As economic activity increases, property incomes can be expected to 
gain in relation to labor incomes by what may be called the capacity 
effect. As the output of firms increases beyond the break-even point, 
there is a concomitant increase in the volume of profits. The traditional 
spreading of overhead, with correspondingly lower unit labor costs, 
produces a larger return to property per unit of output as output increases.
The growth in the relative shares of property income recipients
is certainly not without limit; this limit is associated with a disproportionate
increase in marginal costs.
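
The capacity effect may be illustrated with a stylized single firm. The sketch below is an assumption-laden example, not an estimate: the price, the variable labor cost per unit, and the fixed overhead (treated here as salaried labor cost) are hypothetical numbers, and marginal cost is held constant, so the limit mentioned in the text is not modeled.

```python
def property_share(output: float,
                   price: float = 10.0,           # hypothetical selling price
                   unit_labor_cost: float = 6.0,  # hypothetical variable labor cost per unit
                   overhead: float = 2000.0) -> float:  # hypothetical fixed labor overhead
    """Share of the firm's income going to property at a given output."""
    revenue = price * output
    property_income = revenue - unit_labor_cost * output - overhead
    return property_income / revenue


# Break-even output: the point at which the margin just covers overhead.
break_even_output = 2000.0 / (10.0 - 6.0)

for q in (500, 1000, 2000):
    print(q, round(property_share(q), 3))
```

With these numbers the firm breaks even at 500 units, and the property share rises from zero there to 20 per cent at 1,000 units and 30 per cent at 2,000 units, since the same overhead is spread over a larger output.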

Where increases in economic activity are accompanied by inflationary 
conditions two additional factors will tend to increase the share of 
property income in relation to labor income. These may be termed the 
lag effect and the compounding effect. For present purposes the term inflation
will be used to describe a condition in which demand for final
product is increasing with the prices of goods and services generally
rising, but with some prices rising more rapidly than others. That is, 
inflation is both an aggregative and a differential phenomenon. 

The lag effect, which tends to increase property shares, is a corollary
of the differential impact of price rises. A rising rate of effective de- 
mand confronts an output of goods and services which is increasing less 
rapidly than the increase in demand. The market for final product re- 
ceives the full brunt of the higher demand; the increased demand for 
labor is secondary. The seller of commodities becomes the immediate 
beneficiary of the rising prices. Entrepreneurial shares increase and the 


23 For this point I am indebted to A. M. McIsaac.
increase raises the shares of property income in the total.24 The lag
effect is not independent of the capacity effect. Increases in price for 
final product will raise the break-even point and broaden the range of 
profitable operation. 

Under inflationary conditions, a shift in relative shares of total in- 
come in favor of property income recipients will also result from what 
may be called the compounding effect. That which is compounded is 
savings, and the compounding occurs through its reinvestment.25 Actually,
this effect operates independently of price rises during any period
in which operating losses and capital losses of firms are at a minimum.
However, such a situation is characteristic of an inflationary period. 
Further, as entrepreneurial income expands, relatively, in an inflationary
period, the tendencies toward compounding will be reinforced, if it
may be assumed that the marginal propensity to consume out of entrepreneurial
income is smaller than out of labor income. As long as
inflationary conditions continue, with operating and capital losses at a
minimum, those who have enjoyed the enhanced returns may further
improve their economic position by earning income on income. The
check on the compounding effect is, of course, the appearance of operating
and capital losses. A reduced rate of return on investment will
limit its operation but not offset it. 
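
The mechanics of the compounding effect can be sketched with a simple recursion. Everything in the example is hypothetical: labor income is held constant, a fixed fraction of property income is saved and reinvested, and the reinvested savings earn a fixed rate of return with no operating or capital losses. None of the parameter values comes from the data.

```python
def property_share_path(labor_income: float,
                        property_income: float,
                        saving_rate: float,
                        rate_of_return: float,
                        years: int) -> list[float]:
    """Property share of total income, year by year, under reinvestment."""
    shares = []
    for _ in range(years + 1):
        shares.append(property_income / (labor_income + property_income))
        # Savings out of property income are reinvested and earn a return,
        # which is added to next year's property income.
        property_income += rate_of_return * saving_rate * property_income
    return shares


# Hypothetical starting point: labor 80, property 20, half of property
# income saved, 10 per cent return on reinvested savings.
path = property_share_path(80.0, 20.0, 0.5, 0.10, 3)
print([round(s, 4) for s in path])
```

In this sketch the property share rises every year so long as the return stays positive. A lower rate of return slows the rise without reversing it, which is the sense in which a reduced return limits the effect but does not offset it.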

The importance of each of these factors in the relative growth of
property income under inflationary conditions will vary with the rate 
and duration of the inflation. Presumably, lag effect will be most pro- 
nounced when the rate of price increases is most rapid. Compounding 
effect, on the other hand, is likely to be of greatest importance when in- 
flationary conditions have been sustained for long periods of time. The 
capacity effect will depend on the level from which increases in economic
activity proceed and will be most important when increases in
output are initiated from a point where many firms are operating close
to break-even.

The operation of three factors (lag, compounding and capacity effect)
will alter labor and property shares in accordance with changes
in the level of economic activity and the level of prices. Lag and capacity
effect, in particular, may be expected to be important in year to
year fluctuations. 

These three factors are independent of inter-industry shifts which 


24 The tendency of inflation to increase entrepreneurial shares has often been noted, but seldom
explored in the literature. See Koopmans, op. cit.; and, of course, J. M. Keynes, How to Pay for the War
(Harcourt, 1940), p. 6.

25 Gordon Hayes, Spending, Saving, and Employment (Knopf, 1945), pp. 35-42.
may alter labor and property shares in total income, particularly over a
period of years. Most writers in the field have stressed the importance 
of these shifts in altering labor shares. In examining the upward move- 
ment in relative labor income between 1919 and 1938 Kuznets found 
that relative increases in employee compensation were traceable almost 
wholly to production shifts from industries in which the ratio of wages 
and salaries to property income was low to industries in which the ratio 
of wages and salaries to property income was high. Kuznets did not find 
a measurable tendency for particular industries to vary their ratio of 
wages and salaries to property income over this whole period.26 The
Department of Commerce has also stressed the importance of inter-industry
shifts in the explanation of the larger employee shares in 1950
as compared with 1929.27 

Unfortunately, it is not possible to disentangle statistically the year 
to year effects of the three factors noted above from the effects of inter-industry
shifts over a period of years on labor and property shares in
total income.

Finally, it should be noted that the general conclusion which emerges 
from the examination of labor and property shares in accordance with 
the concepts developed here is that they are by no means stable, either 
from year to year or over a period of years. This conclusion differs
rather sharply from that reached by analysts in the Department of 
Commerce who have stressed the stability in income shares, particularly
in the employee compensation share of national income.28 The
explanation for these differing conclusions lies in the concepts which 
are employed. The Department of Commerce analysis of employee 
compensation is based on a concept of income which includes corporate 
profits taxes and inventory valuation adjustment. Both of these are 
excluded from Concepts I and III as developed here; the latter Concept
also includes capital gains and transfer payments. The importance of 
the conceptual basis in the measurement of relative shares is well illustrated
in the analysis of the corporate sector (Chart III and Appendix
Table B). In 1929 the addition of corporate profits taxes to total income 
affects labor's share (Concept I) by only 2.4 percentage points. In 1950 
this addition affects labor's share by 11.1 percentage points. 
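
The same sensitivity to the income concept can be seen for the economy as a whole. The sketch below recomputes labor's share under the three concepts of Appendix Table A from the 1929 column as printed there (total labor 64,571; total property, Concept I, 21,755; corporate profits taxes 1,398; capital consumption allowances 8,816). The function and variable names are illustrative only, and the figures are for all sectors, so they are not the corporate-sector figures of Table B cited in the paragraph above.

```python
# 1929 column of Appendix Table A, millions of dollars.
total_labor = 64_571
property_concept_1 = 21_755
corp_profits_taxes = 1_398      # added to property under Concept II
cap_cons_allowances = 8_816     # added to property under Concept III


def labor_share(labor: float, prop: float) -> float:
    """Labor income as a percentage of labor plus property income."""
    return 100.0 * labor / (labor + prop)


shares = {
    "Concept I": labor_share(total_labor, property_concept_1),
    "Concept II": labor_share(total_labor, property_concept_1 + corp_profits_taxes),
    "Concept III": labor_share(total_labor, property_concept_1 + cap_cons_allowances),
}

for concept, share in shares.items():
    print(f"{concept}: {share:.1f}")
```

The computed shares reproduce the 1929 ratios shown in Table A (74.8, 73.6 and 67.9 per cent). For all sectors taken together the addition of corporate profits taxes lowers labor's share by about 1.2 percentage points in 1929.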


(Continued on page 218)


26 Kuznets, National Income and Its Composition, pp. 241-50.

27 Denison, “Distribution of National Income,” pp. 16-23.

28 Ibid.; Lawrence Grose, “Labor income in the postwar period,” Survey of Current Business, May
1952, 7-13.
TABLE A
LABOR AND PROPERTY SHARES: THE PRIMARY DISTRIBUTION OF INCOME 
1929-1950 
(money figures in millions of dollars) 
1929 1930 1931 1932
5,811 4,786 3,620 2,508 
8,420 2,455 —1,283 —3,424 
8. Net Interest... 6,541 6,176 5,938 5,430 
4. Govern't Interest. оз ой 104 ты! 
5. Total Property (Concept I)..... 21,755 14,381 9,359 5,655
13,927 10,963 8,24 4,921 
142 155 611 295 
13,785 10,208 7,603 4,626 
50,780 40,515 30,470 30,826 29,330 
10. Total Labor (Concept I). 64,571 56,723 47,073 35,452 35,002 
11, Total Property (Line 5) 21,755 14,381 9,359 5,06 
12, Total Income (Concept I). . 80,320 71,104 56,432 1,107 42,898 
Ratios: Concept I 
13. Total Labor (Line 10). . 74.89 79.8 83.4 86.2 
14. Total Property (Line 11). 25.2 20.2 10.6 13.8 
15, Total Property (Line 1) . 21,755 14,381 9,359 5,055 
16, Add: Corp. Profits Taxes, 1,398 848 500 382 
17, Total Property (Concept IT) 23,153 15,229 9,859 6,037 
18, Total Labor (Line 10). 64,571 56,723 47,073 35,452 35,062 
19. Total Income (Concept . 87,724 71,952 50,932 41,480 43,422 
Ratios: Concept II 
20. Total Labor (Line 18).... 73.6 78.8 82.7 85.4 
21. Total Property (Line 17). 20.4 21.2 17.3 > 14.6 
22. Total Property (Line 11)... 21,755 14,981 9,3599 5,655 
23, Add: Cap. Cons, Allowances. 8,816 8,747 8,312 7,66%. 
24. Total Property (Concept III). 30,571 23,128 17,671 13,318 15,081
25, Total Labor (Line 10)... . 64,571 56,723 47,073 35,452 35,062 
26, Total Income (Concept III)..... 95,142 79,851 64,744 48,770 50,143 
Ratios: Concept III
27. Total Labor (Line 25)... 67.9 71.0 72.7 72.7
28. Total Property (Line 24). . 32.1 29.0 27.3 27.3


Source: Department of Commerce, National Income, 1951.
1942


1943 194 


1945 1946 


1,205 
13,887 


11,282 

—166 
11,448 
47,820 
50,268 
13,887 
78,155 


81.0 
19.0 


13,887 

1,402 
15,949 
59,208 
74,617 


79.4 
20.6 


13,887 

8,101 
21,088 
59,268 
81,256 


1,291 1,289 
15,462 19,110 


12,000 16,504 

—52 -6 
12,712 17,48 
51,786 64,280 
64,498 81,428 
15,462 19,10 
79,960 100,538 


80.7 81.0 
19.3 19.0 


15,462 19,10 

2,88 7,846 
18,40 26,956 
64,408 81,428 
82,838 108,384 


77.9 75.1 
22.1 24.9 


15,402 19,110 

8,40 9,294 
23,902 28,404 
04,408 81,428 
88,400 109,832 


73.0 74.1 
27.0 25.9 


5,395 
9,433 
3,804 
1,517 
20,239 


23,041 
—372 
23,413 
84,805 
108,308 
20,239 
128,547 


84.3 
15.7 


20,239 
11,605 
31,904 
108,308 
140,212 


77.2 
22.8 


20,239 
9,981 
30,220 
108,308 
138,528 


78.2 
21.8 


6,100 6,495 
10,646 10,808 
3,355 3,137 
2,40 2,803 
22,250 23,243 


26,731 28,997 
—154 -70 
26,885 29,067 
109,212 121,163 
136,097 150,230 
22,250 23,243 
158,347 173,473 


85.9 86.6 
14.1 13.4 


22,250 23,243 
14,406 13,525 
36,656 36,768 
136,097 150,230 
172,753 180,998 


78.8 80.3 
2.3 197 
. 
22,250 23,248 
10,680 11,887 
32,0900 25,100 
136,097 150,230 
169,027 185,360 


80.5 81.0 
19.5 19.0 


6,256 6,620 
8,502 19,881 
3,000 2,022 
3,0 4,432 
21,430 27,855 


81,247 35,375 

—13 -1,819 
31,360 37,194 
123,020 117,008 
154,386 154,202 
21,430 27,855 
175,816 182,147 


87.8 2 84.7 
12.2 15.3 


21,430 27,855 
11,215 9,583 
32,645 37,438 
154,386 154,202 
187,031 191,730 


82.5 80.5 
17.5 19.5 


21,430 27,855 
12,410 12,163 
33,840 40,018 
154,386 154,202 
188,226 194,310 


82.0 79.4 
1.0 206 


33,530 
11,940 
45,470 
104,900 
210,370 


78.4 
21.6 


33,530 
14,845 
48,375 
164,900 
213,275 


77.8 
22.7 


140,106 
180,312 

37,026 
217,338 


230,366 


78.3 
21.7 


37,026 
17,612 
54,638 
180,312 
234,950 


76.7 
23.3 


139,887 153,333 
173,158 190,849 
34,410 40,900 
207,508 231,758 


218,557 250,951 


79.2 76.2 
20.8 23.8 


94,410 40,000 
19,058 21,177 
53,468 62,080 
173,158 190,849 
226,626 252,935 


< 
76.4 75.5 
2.6 245 
TABLE B
LABOR AND PROPERTY SHARES IN THE CORPORATE SECTOR: THE PRIMARY DISTRIBUTION OF INCOME


1929-1950 
(money figures in millions of dollars) 


1929 1930 1931 1932 1933 1934 1935 1936 


Property Sector: 
1, Corp. Profits After Tax. 8,188 2,318 -1,270 —3,390 —360 917 2,100 4,109 
2. Net Interest........ 1,0017 1,42 1,800 1,759 1,715 1,707 1,020 1,557 


3. Total Property (Concept I 


4, Comp. of Employees. . . 
5, Total Property (Line 3) 
6. Total Income (Concept I). 


9,805 4,060 523 —1,031 1,355 2,024 3,726 5,726 


Ratios: Concept I 
7. Total Labor (Line 4). 
8, Total Property (Line 5). . 


9. Total Property (Line 5). . 
10. Add: Corp. Profits Taxes. 

' 11, Total Property (Concept П) 
12. Total Labor (Line 4)... .. 
13. Total Income (Concept Ш 


- Ratios: Concept П 
М. Total Labor (Line 12). ......... 75.0 86.0 6.0 -- 90.3 85.0 82.7 78.2 
15. Total Property (Line 13)....... 25.0 14.0 40 -- 9.7 14.1 17.8 21.8 


18, Total Property (Concept III) 
19. Total Labor (Line 4). 
20. Total Income (Concept. 


Ratios: Concept III 
21. Total Labor (Line 19). . 78.2 77.4 76.0 74.0 
22, Total Property (Line 18). 21.8 22.6 24.0 26.0 


Source: Department of Commerce, National Income, 1951.
TABLE B (Continued)
1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950
,042 4,821 6,218 9,155 9,208 10,408 10,515 8,274 13,427 17,825 10,808 16,506 21,083 
1,482 1,90 1,224 1,06 1,18 М9 85 74 470 605 5 600 — 548 
8,524 6,21 7,437 10,201 10,366 11,357 11,360 9,018 13,807 18,430 20, 17,160 22,281 
26,520 20,098: 32,076 40,064 52,096 63,197 00,183 62,100 68,884 81,220 80,003 87,302 90,547 
8,54 — 06211 7,437 10,261 10,366 11,357 11,360 9,018 13,807 18,430 20,498 17,166 22,231 
30,044 35,244 39,513 51,225 62,402 74,054 77,543 72,127 82,781 99,650 110,431 104,558 118,778 
| қ 
| 
вз 82.4 81.2 80.0 834 М8 85.4 87.5 83.2 81.5 81.5 83.6 813 
|1L7 1,6 188 200 166 152 М6 125 168 185 185 104 187 
3,54 6,21 7,437 10,261 10,366 11,357 11,360. 9,018 13,897 18,430 20,438 17,166 22,231 
1,40 1,400 2,878 7,86 11,065 14,406 13,525 11,215 9,583 11,940 13,028 10,980 18,509 
4,564 7,673 10,315 18,107 22,031 25,763 24,885 20,233 23,480 30,370 93,466 28,155 40,804 
4 520 20,033 32,076 40,964 52,036 63,197 66,183 63,100 €8,884 81,220 80,003 87,302 90,547 
31,084 36,706 42,391 59,071 74,007 88,060 91,008 83,342 92,364 111,590 128,459 115,547 197,371 
s 
5з — 701 тт өз оз по 727 БЛ иб 128 729 758 702 
ИЛ 009 из юл эт южо 93 из жа M2 жі MA в 
$54 621 7,437 10,261 10,366 11,957 11,360 9,018 13,897 18,430 20,438 17,106 22,291 
3,354 3,42 2,98 3,008 4,472 5,074 5,44 5,925 4,957 5,06 6,48 7,186 7,8% 
6,878 — 0,053 10,945 14,169 14,838 16,431 17,194 14,943 18,154 23,716 26,786 24,302 30,127 
0,520 29,033 32,076 40,064 52,036 63,197 400,183 63,109 68,884 81,220 89,993 87,392- 06,547 
8,808 38,080 43,021 55,133 66,874 70,628 83,377, 78,052 87,008 104,000 116,779 11,744 190,674 
d з 
(94 Бо мз из тв, 704 та 80.9 791 па та 782 762 
20.6 25.0 . 255 257 222 20.6 200 191 20. 226 220 2,8 238 
TABLE C 
LABOR AND PROPERTY SHARES: THE CLAIMS CONCEPT OF INCOME 
1929-1950 
(money figures in millions of dollars) 
1929 1930 1931 1932 1933 1934 1935 1936 
Property Sector: | 
1, Total Property ..... . 21,755 14,381 9,950 5,655 7,836 9,052 10,227 12,530 13 


2. Net Cap. Gains. Corps. 
3, Net Cap. Gains, Indiv.. 
4, Total Property (Concept I). . 


816" —200 -1,44 —1,563 —1,423 — 55 231 439 
. 2,803 —1,360 —2,718 —2,082 —1,403 —630 —89 589 
. 95,404 12,731 5,97 1,410 5,010 8,367 10,360 13,558 1344 


Labor Sector: 
5. Total Labor Income... . ‚ 04,871 56,723 47,073 25,452 35,002 40,724 47,015 52,737 М) 
371 377 383 391 388 427 462 767 м 
64,200 56,346 46,690 35,061 34,074 40,297 46,553 51,970 5, 
1,4099 1,5944 2,073 2,152 2,113 2,193 2,389 3,520 
65,009 57,890 49,363 37,213 36,787 42,490 48,042 55,490 
25,404 12,731 5,237 1,10 5,010 8,267 10,360 13,558 
. 91,163 70,021 54,000 28,623 41,707 50,857 59,911 69,048 


11. Total Income (Concept Т). 


Ratios: Concept 1 
12. Tota? Labor (Line 9)... 
18. Total Property (Line 10). 


72.1 82.0 90.4 96.3 88.0 83.5 82.5 80.4 
27.9 18.0 9.6 3.7 12.0 16.5 17.5 19.6 


25,464 12,731 5,07 1,410 5,010 8,367 10,360 13,558 15 
. 8,810 8,47 8,82 7,6% 7,45 7,218 7,360 7,64 7j 
- 34,280 21,478 13,549 0,073 12,255 15,585 17,738 21,22 2 
. 65,009 57,890 40,363 27,213 26,787 42,400 48,942 55,400 00 
. 90,079 79,268 62,012 40,286 49,042 58,075 66,680 70,732 818 


Ratios: Concept ПІ 
19, Total Labor (Line 17). . 
20. Total Property (Line 18) 


65.7 72.9 ^ 78.5 80.4 75.0 73.2 73.4 72.3 
34.3 27.1 21.5 19.5 25.0 26.8 26.6 27.7 


Source: Department of Commerce, National Income, 1951; capital gains from Lawrence H. Seltzer, The Nature and Tax Treatment
of Capital Gains and Losses (National Bureau of Economic Research, 1951), pp. 367, 531.
Estimated from Department of Commerce, National Income, 1951, Table 38, p. 202.
Includes employer contributions for social insurance and for private pension funds and employee contributions for
social insurance.
TABLE C (Continued)


1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 


11,049 13,887 15,402 19,110 20,239 22,250 23,243 21,430 27,855 33,530 37,020 24,410 40,909 
75 70 -62 —956 —16 -10 63 641 1,209 924 975% 

8841 -20 487 -86 — —380 1,057 1,602 4,207 6,44 4,4506 4,325" 

10,788 13,604 14,353 17,258 19,683 23,147 24,908 26,338 35,708 38,004 42,326 


05,204 59,208 04,408 81,428 108,308 130,097 150,230 154,386 154,202 104,900 180,312 173,158 190,849 
2,22 2,292 2,452 2,067 3,715 4,908 5,897 7,019 7,222 7,208 7,084 7,008 9,379 
3,172 56,076 02,040 78,461 104,502 131,189 144,333 147,367 147,070 157,032 173,228 105,400 181,470 
52,834 2,903 3,119 8,19 3,150 2,971 3,597 6,179 11,420 11,803 11,285 12,352 15,082 
50,006 50,030 65,165 81,580 108,043 134,160 147,920 153,546 158,490 160,435 184,513 177,812 198,553 
10,788 13,604 14,353 17,258 19,083 23,147 24,908 26,338 35,768 38,004 42,326 

00,789 73,633 70,518 98,838 127,726 157,307 172,828 179,884 194,258 208,939 226,839 


81.6 81.3 81.3, < 


85.3 8&1 85. А 


4 
14.7 13.9 M.6 18.4 18.7 18.7 


1,788 13,004 14,353 17,298 19,083 23,147 24,908 26,338 35,708 38,904 
7,02 811 8,40 9,204 9,081 10,680 11,887 12,40 12,103 14,845 
18,775 21,705 22,703 26,552 29,604 33,827 20,705 28,748 47,081 53,749 
0,008 59,099 05,165 81,580 108,043 134,160 147,920 153,546 158,490 109,435 
4,781 81,74 87,058 108,132 137,707 107,987 184,715 192,294 206,421 223,184 


О 


74.9 788 ма 154 785 100 80k 198 768 759 

351 207 259 мв 215 в 2.1 199 ^ 202 232 941 
Estimated from Treasury Department, Statistics of Income. It has been assumed, for 1947 and 1948, that short term gains are
negligible. Therefore, the net gains as reported for tax purposes have been doubled to produce the estimates here.


TABLE D


LABOR AND PROPERTY SHARES, 1919-1929 
(money figures in millions of dollars) 


== 


1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 


==» 


.92 


4. Corp. Net Savings. A 
15.66 16.62 1. 


5. Total Property.... 1 
Labor Sector: 

6, Wages & Salaries... 36.7 43.3 34.9 36.4 42.7 42.7 44.4 47.4 47.8 48.7 51.5 

7. OtherPay.toEmpl .43  .57 .60 .60 .62 .62 .61 .62 .65 .66 .69 

8. Entre. Withdr. .. 11.8 13.5 10.3 10.8 11.8 11.9 12.5 12.5 12.6 12.9 13.4 

9, Entre. Savings. 5.5 1.6  .63 —.09 1.2 ,7 1.0 2111  .91 1.1 


9 

2 5 
.. 3. 8 
1 9: 
1 6: 


10. Total Labor....... 54.43 58.97 46.43 47.71 55.82 56.09 59.11 62.62 62.15 63.17 66.69
11. Total Property 
(Line 5) 11.1 13.4 12.11 12.13 14.17 15.66 16.62 18.3 


14. 8 
12. Total Income. 65.53 72.37 58.54 59.84 69.99 70.31 74 79.42 77.81 79.79 84.99
Ratios: 


13. Total Labor (Line 10)........ 83.1 81.5 79.3 79.7 79.8 79.8 79.4 78.8 79.9 79.2 78.5
14. Total Property 
(Line 11)......... 16.9 18.5 20.7 20.3 20.2 20.2 20.6 21.2 20.1 20.8 21.5 


Source: Simon Kuznets, National Income And Its Composition, 1919-38, Vol. I (National Bureau 
of Economic Research, 1941), pp. 216-17. 


SUMMARY 


The major conclusions which may be drawn from the foregoing investigation
of labor and property shares in total income are the following:

1. Since the Twenties labor shares in total income have tended generally
to increase. The recovery period of the Thirties shows labor
shares higher than in the Twenties, and higher in World War II than 
in the Thirties. However, the postwar period has tended to reduce labor 
shares very nearly to their position in the late Thirties. 

2. The available data on the distribution of income by size class, 
when compared with changes in labor and property income shares,
would seem to indicate that increased labor shares are generally associ- 
ated with reduced concentration in the distribution of income. 

3. As economic activity declines, labor shares tend to increase. As
economic activity increases, labor shares tend to decline. A decline in 
labor shares (except for World War II) seems to be accentuated by 
inflationary conditions. 

4. The measurement of income on the basis of claims, rather than on
the basis of primary distribution, alters somewhat the relative change 
in labor and property income since the Twenties, but does not alter the
relationship between changes in economic activity and changes in
labor's relative share. 

5. Where the measurement of income includes capital consumption
allowances, relative shares appear to be considerably more stable be- 
tween prosperity and depression. 

The conclusion which is of most general theoretical interest is the
third: that increases in economic activity are associated with a reduction
in labor shares. This point is particularly significant if a reduction
in labor shares is linked to increased concentration in the distribution 
of income. For this will mean that, if the marginal propensity to con- 
sume of property income recipients is lower than the marginal propen- 
sity to consume of labor income recipients, increases in economic ac- 
tivity will tend to increase savings ratios and, therefore, increase the 
volume of investment necessary to maintain the higher level of income. 
Since inflation appears to accentuate the shift to property income, it 
may be said that the inflationary process itself makes it more difficult 
to maintain prevailing levels of national income. Unfortunately, our 
present knowledge of consumption ratios by income class does not per- 
mit a final assessment of the significance of this point. We do not now 
know the range of variation in the marginal propensity to consume by 
income class or by type of income.
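
The savings argument of this paragraph can be put in arithmetic form. The sketch below is purely illustrative: as the text notes, the propensities to consume by type of income are not known, so the two values used here are hypothetical, and for simplicity the same propensity is applied to the whole of each type of income.

```python
def aggregate_saving_ratio(labor_share: float,
                           mpc_labor: float,
                           mpc_property: float) -> float:
    """Saving as a fraction of total income, weighted by income shares."""
    property_share = 1.0 - labor_share
    return (labor_share * (1.0 - mpc_labor)
            + property_share * (1.0 - mpc_property))


# Hypothetical propensities to consume: 0.95 out of labor income,
# 0.75 out of property income.
before = aggregate_saving_ratio(0.85, 0.95, 0.75)  # labor share 85 per cent
after = aggregate_saving_ratio(0.80, 0.95, 0.75)   # labor share 80 per cent

print(f"saving ratio, labor share 85%: {before:.3f}")
print(f"saving ratio, labor share 80%: {after:.3f}")
```

Under these assumed propensities a five-point shift from labor to property income raises the saving ratio from 8 to 9 per cent of income, which is the sense in which a larger volume of investment is needed to maintain the higher level of income.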

It is also of considerable interest that the “normal” pattern of relationship
between labor and property shares and changes in economic
activity was substantially altered during World War II. The increase in
economic activity during this period was associated with an increase,
not a decrease, in labor's share of total income. A comparison of this 
experience with the experience of the years 1946-48 and 1950, when 
labor shares were declining, suggests that the effective stabilization 
program of the war years may be responsible for the large increases in
labor shares of total income in these years. 


THE POST-ENUMERATION SURVEY OF THE 1950 CENSUS: 
A CASE HISTORY IN SURVEY DESIGN 


ELI S. MARKS, W. PARKER MAULDIN AND HAROLD NISSELSON
Bureau of the Census 


THE primary purpose of the present paper is to trace the development
of the Post-Enumeration Survey, which was designed to
measure error in the 1950 Censuses, and to outline the reasons for the
decisions made and the alternatives considered in designing this survey. 
This paper points up the extent to which decisions on survey design
have to be reached on the basis of intuition and opinion, due to the 
absence of any satisfactory objective data. Such analysis directs attention
to the major gaps in the knowledge of survey technique (the
spots where there is need to supersede art with science) and may raise
as subjects for investigation, points which have been implicitly ac- 
cepted as axiomatic. For example, discussions of field methods usually 
devote considerable space to improving the selection and training of 
interviewers but practically no attention is given to the problem of 
whether the gains from such improvements are commensurate with the 
effort and expenditure involved. 

The existence of errors in a Census is readily apparent in view of the
vastness of the undertaking, with its necessarily large and (in large
part) inexperienced enumerating staffs who are called on to obtain facts 
on a wide variety of subjects from a group of respondents who may be 
given only a partial understanding of what is wanted and who may 
have a very incomplete knowledge of the information requested. In 
addition to these and other problems in the collection of the data, there 
are also human and mechanical failures in the processing of the data 
that are not always detected.

The recognition of the existence of such errors is not new. For many
years, the Census volumes have contained introductory statements 
pointing out inconsistencies and possible sources of error in the statis- 
tics. Also, analyses of error in, and questions regarding the accuracy of, 
Census data can be found in numerous articles, a few of which are 
listed at the end of this paper. 

What is relatively new is the attempt to provide a measure of the 
errors on the basis of an independent enumerative check. Checks of 
this sort were conducted in connection with the 1945 Census of Agricul- 
ture, the 1947 Census of Manufactures, and the 1948 Census of Busi- 
ness. These experiences formed the basis for planning the 1950 Census 
Post-Enumeration Survey.

The design of the Post-Enumeration Survey was premised on a need
for a high level of accuracy—a level substantially higher than required 
or attained in the Census. Since the Census was taken with considerable 
attention to securing accurate results, such a requirement was quite 
rigorous. On the other hand, it was agreed that perfection was neither 
attainable, nor necessary, in order for the results to be useful. 

Not only was there the desire to measure with precision the 
amount of error, but also, to the extent possible, to determine the 
reasons for error. This second feature is particularly significant since 
exploration of the sources of error should provide our most valuable 
guide to improving future surveys. It can be seen that a study of “re- 
sponse variation” would not answer these purposes. Measures of re- 
sponse variation—i.e., the differences which occur when identical ques- 
tions are asked under (presumably) identical conditions—help to point 
out weaknesses in our data. They do not, however, throw much light 
on how to correct these weaknesses. The fact that two responses differ 
indicates that one (or both) of the responses is in error but does not 
indicate which one (if either) is correct. 

A brief statement indicating the organization of the Bureau may 
help in understanding how the Post-Enumeration Survey was planned 
and conducted. Under the Director and Deputy and Assistant Directors,
there are a number of Divisions. Those concerned in the Decennial
Census, and hence in the Post-Enumeration Survey, were the Agricul- 
ture Division, the Field Division, the Geography Division, and the 
Population and Housing Division. The Office of the Assistant Director 
for Statistical Standards was given the responsibility for coordinating 
these interests. All major issues were reviewed and decisions formulated
or revised in a committee consisting of representatives of all these
organizational units. In addition, valuable advice and assistance were
received from others outside the Bureau. This paper is, necessarily, a
report, not of work done by a single individual, but of decisions and
plans formulated by a group.

In preparing the paper, minutes of meetings and other working rec- 
ords have been consulted but many of the gaps in these records (par- 
ticularly in the area of alternatives considered) had to be filled in from 
memory. Of necessity, the description of the survey is telescoped and 
many important details have been omitted. A complete description of
the checking methods used in the Census, or even of the Post-Enumera- 
tion Survey itself has not been attempted. Materials of this type will 
be incorporated in the publications giving the results of the Census and 
Post-Enumeration Survey.


SOME MAJOR FACTORS IN THE SURVEY DESIGN


The Post-Enumeration Survey was primarily concerned with: 
A. Coverage errors, consisting of:
   (1) Omissions of persons, households, dwelling units, and farms.
   (2) Duplications of persons, households, dwelling units, and
       farms.
   (3) Inclusion of persons, households, dwelling units, or farms,
       but in the returns for the wrong area.
B. Content errors, consisting of omissions or incorrect entries as
   responses to the specific questions on the schedules.


One of the first problems that had to be met was that of sample de- 
sign. In designing the sample several factors had to be considered: 

(1) Measurement of omissions. For most purposes, the Census pro- 
vides a satisfactory listing of the population. It cannot, of course, pro- 
vide a satisfactory listing for the purpose of measuring its own com- 
pleteness and, in general, any other source would be even less satis- 
factory. Thus, it was necessary to find some technique for drawing a 
sample of those units (persons, households, families, farms) which were 
not listed in the Census. The only technique available appeared to be 
to draw a sample of small areas (segments) and make special efforts in 
the Post-Enumeration Survey to list all units which were in the segment
at the time of the Census. It could then be determined which
units were enumerated by comparing the Post-Enumeration Survey 
segment listings with the Census listings. 

(2) Measurement of duplications and inclusions in the wrong area. 
Checking “coverage” of the Census involves both the completeness of 
the enumeration and its accuracy. The segment sample would provide
a measure of completeness. For checking accuracy of coverage, however,
it was also necessary to determine how many units had been
erroneously included in the Census listing. To identify erroneously-in- 
cluded units by comparison of the Post-Enumeration Survey segment 
listings with the Census listings would have required constructing a 
Census listing for the segment. While all units in the Census can be 
identified as being located in a particular Enumeration District (the 
area assigned to a particular enumerator), the Post-Enumeration Sur- 
vey segments were considerably smaller than an Enumeration District. 
In many cases (approximately 20 per cent of the total), addresses
shown within Enumeration Districts are not specific enough to identify 
the location by segment. Thus, any segment listing constructed from 
the Census would inevitably contain units which had been enumerated
in the correct Enumeration District but which were not actually located
in the segment to which they had been assigned. This would result
in an overestimate of the number of erroneously included units and
would also increase the sampling variance of estimates of net coverage 
error (i.e., the difference between erroneously omitted and erroneously 
included units). 

An alternative would have been to estimate the number of errone- 
ously included units by subtracting an estimate of the number of units 
in the Census which were enumerated in the correct place (as deter- 
mined from the Post-Enumeration Survey interviewers’ segment list- 
ings) from the total number of units enumerated in the Census. Assum- 
ing a fixed total expenditure, however, an estimate obtained in this 
manner would be subject to a considerably larger sampling error than 
one obtained by sampling the Census listings and directly identifying 
the units erroneously included. 

In addition, the indirect technique would not give any clues regard- 
ing the reasons for over-enumeration, i.e., how many over-enumerations 
were a result of carelessness, how many were a result of complexities or 
ambiguities in the rules determining who should be enumerated, etc.
The segment technique would require a third visit to determine the
reasons for over-enumerations.

Thus, it was decided to use two samples, a segment sample for units 
which were missed and a sample of Census listings to measure erroneous
enumerations. Since an interview to check on over- or under-enumera- 
tion was necessary for every unit in both samples, it would have been 
desirable to overlap the samples to the maximum extent possible. Com- 
plete overlap would have required that one be able to identify in ad- 
vance the segment in which each Census listing was located. Although 
this was impossible for the reasons discussed above, an effort was made 
to obtain a high degree of overlap.
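
The arithmetic of the two-sample design can be sketched as follows. This
is an illustration only: the counts, the uniform inflation weight, and the
names CheckSample and net_coverage_error are assumptions introduced
here, not figures or procedures taken from the survey. The segment sample
supplies the estimate of erroneously omitted units, the sample of Census
listings supplies the estimate of erroneously included units, and the net
coverage error is their difference.

```python
from dataclasses import dataclass


@dataclass
class CheckSample:
    """One check sample; every figure used below is hypothetical."""
    units_in_error: int  # units found in error in the sample
    weight: int          # inflation factor (population units per sample unit)

    def estimated_total(self) -> int:
        # Simple inflation estimate: sample count times the inflation factor.
        return self.units_in_error * self.weight


def net_coverage_error(segment: CheckSample, listing: CheckSample) -> int:
    """Erroneously omitted units minus erroneously included units."""
    return segment.estimated_total() - listing.estimated_total()


# Hypothetical counts, chosen only to show the arithmetic.
segment_sample = CheckSample(units_in_error=120, weight=1000)  # missed units
list_sample = CheckSample(units_in_error=45, weight=1000)      # erroneous inclusions

omitted = segment_sample.estimated_total()
included = list_sample.estimated_total()
net = net_coverage_error(segment_sample, list_sample)

print(f"estimated erroneous omissions:  {omitted:,}")
print(f"estimated erroneous inclusions: {included:,}")
print(f"estimated net coverage error:   {net:,}")
```

Because each component is estimated directly from its own sample, the
reasons for each kind of error can be recorded at the interview, which the
indirect subtraction method would not permit.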

(3) Measurement of content error. With this design, the units which
were defined correctly (e.g., a farm which existed, which should have
been enumerated and which was enumerated in the correct area, but
not necessarily enumerated with the proper characteristics) might have 
been obtained from either or both samples. Information from such units 
was desired to determine whether their characteristics had been cor- 
rectly reported. 

Where overlap of the two samples was not achieved, it was necessary 
either to take complete interviews on all units in both samples and 
adopt a complex weighting system or to restrict the check on accuracy 
of characteristics for correctly included units either to those in the
“segment sample” or those in the “list sample.” Complete interviews for all
units would, of course, have been more expensive than restricting complete
interviews to only one of the samples (particularly when the cost
of weighting is included in the expenditures). The additional expenditure
would not have been justified by the increase in information obtained.
Because it was considered desirable to furnish the check interviewer
with the information obtained in the Census for a given unit
(see the discussion below of “Transcription”), it was decided to restrict 
complete interviews to units in the sample of Census listings since they 
could be identified in advance. It was also necessary, of course, to ob- 
tain complete information from the “missed” units in the segment 
sample.

(4) Length of interview. In designing the Post-Enumeration Sur- 
vey interview, emphasis was placed on investigating each variable 
thoroughly even if this meant restricting the number of variables to be 
checked. After the elimination of a great many Census variables from 
the Post-Enumeration Survey, it was found that the interview would 
still be quite extensive if the remaining variables were to be checked 
thoroughly. This did not present any serious problem except in farm 
households where it might be necessary to check the definition of the 
dwelling unit and of the farm, the completeness and correctness of the 
listing of persons, the housing characteristics of the dwelling unit, the
farm characteristics, and the characteristics of each member of the 
household. Experience in a pretest indicated that a full interview for 
farm households would average 1½ hours and that the upper quartile
would exceed two hours. It was felt that interviews of this length would 
be undesirable. 

The alternatives were (a) to drop additional items from the check, 
(b) to check some (or all) of the items less thoroughly, or (c) to check 
some of the items in one sample and others in another—in particular
to separate the farm check from the check on population and housing 
characteristics. Dropping items from those originally selected was not 
an acceptable solution since appreciable savings in interview time could 
be made only by eliminating items which were of primary importance. 
A less thorough check was also undesirable—more, rather than less, 
investigation might be necessary to insure results of the required ac- 
curacy. Checking different items in different samples was feasible but 
might mean more complex instructions to the field staff, complications 
in processing the data, increased cost or increased sampling error and, 
possibly, increase in the error from sources other than sampling variation.
Despite its disadvantages, this alternative did permit cutting the
length of interviews in farm households without affecting the interviews
for nonfarm households.

An additional factor in the decision was the complexity of the job
facing the interviewer—he would have to master the definition of a 
dwelling unit and of a place requiring a farm check; the definitions of 
more than 100 different Census variables; the problems of defining 
“usual residence” and “farm headquarters”; the special procedure for
occupied living quarters which did not meet the dwelling unit definition 
and for other places which were by definition not dwelling units; the 
use of maps and aerial photographs; the preparation of rough detail 
maps; the methods of recording on seven different forms; how to interview
and how to probe for causes of discrepancies between his own results
and those obtained by the Census enumerator; and the various
special modifications of the procedures required by subsidiary studies 
which were linked to the Post-Enumeration Survey. 

In the light of these considerations, it was decided to draw three 
samples in rural areas: the segment sample to be checked to locate 
missed households and farms; a sample of Agriculture Census listings 
to be checked for over-enumeration and for farm characteristics (the 
farm sample was overlapped with the segment sample as much as possi- 
ble); and an independent sample of listings from the Population and 
Housing Census to be checked for over-enumeration on population and 
housing characteristics. The segment and farm samples were assigned 
to rural interviewers and the household sample to urban interviewers. 
The split was not made in urban areas where farm checks would be 
relatively infrequent. 

Although the decision was predicated in part on the assumption that 
length of time required for a complete interview in farm households 
would be excessive, this is by no means an established fact. The length 
might have been feasible and, even if it were not, there were other 
means of reducing interview length—actually it was later decided (see
below) to subsample individuals within households, which cut the length
of interview per household, but in view of the other considerations and 
the pressure of circumstances, the decision to check agriculture items 
and population and housing items in separate samples was not re- 
examined. 

(5) Selection, training and supervision of interviewers. The objectives 
of the Post-Enumeration Survey required that interviewers be carefully
selected, well-trained and competently supervised. With a fixed
budget, increasing the number of interviewers meant either reducing
the training and supervision expenditures per interviewer or cutting at 
some other point. 


INITIAL DECISIONS ON SURVEY DESIGN 


As a first step in design of the sample, comparisons were made of 
estimated costs of different designs. Preliminary computations pointed 
to use of a design involving 300 primary sampling units. Drawing of a
sample on this basis was, therefore, initiated. A two-stage sample was
drawn: each primary sampling unit consisted of a county or a
group of contiguous counties, and within the primary sampling units
segments were selected containing 6 households in urban areas or 10
households in rural areas. Later recalculation as shown in Table I indicates
minimum cost for 200 primary sampling units. With the rough assump- 
tions made about costs and intraclass correlations, the differences be- 
tween 200 and 300 primary sampling units (with the same staffing pat- 
tern) are not of magnitude sufficient to form a basis for choice and the 
design with 300 primary sampling units actually used is presumed to be 
near the minimum cost design. 
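
The kind of comparison described above can be sketched with a textbook
two-stage model. Everything in the sketch is an assumption made for
illustration: the variance model (unit variance s2, intraclass correlation
rho), the two cost coefficients, and all numerical values; none of them
are the figures underlying Table I. For each candidate number of primary
sampling units, the code finds the number of sample units per primary
sampling unit needed to hold the sampling variance fixed, then prices the
design. Fractional sample sizes are left unrounded to keep the sketch short.

```python
def units_per_psu(n_psu, target_var, s2, rho):
    """Sample units needed in each PSU to reach target_var under
    Var = s2 * (1 + rho * (n - 1)) / (n_psu * n).
    Returns None when no within-PSU sample size can reach the target."""
    denom = target_var * n_psu - s2 * rho
    if denom <= 0:
        return None
    return s2 * (1 - rho) / denom


def design_cost(n_psu, target_var, s2, rho, cost_per_psu, cost_per_unit):
    """Total cost of a design that meets target_var, or None if infeasible."""
    n = units_per_psu(n_psu, target_var, s2, rho)
    if n is None:
        return None
    return cost_per_psu * n_psu + cost_per_unit * n_psu * n


# Hypothetical inputs (not taken from Table I).
ASSUMED = dict(target_var=0.0001, s2=0.25, rho=0.05,
               cost_per_psu=400.0, cost_per_unit=10.0)

candidates = [100, 150, 200, 250, 300, 400, 500]
costs = {m: design_cost(m, **ASSUMED) for m in candidates}
feasible = {m: c for m, c in costs.items() if c is not None}
best = min(feasible, key=feasible.get)

for m, c in costs.items():
    label = "infeasible" if c is None else f"{c:,.0f}"
    print(f"{m:>4} PSUs: {label}")
print("lowest-cost candidate:", best)
```

Under assumptions of this kind the cost curve is typically flat near its
minimum, which is consistent with the remark that the difference between
200 and 300 primary sampling units gave no firm basis for choice.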

Final computations made to compare different survey designs are 
summarized in Table I. Designs 1 to 7 assume that all interviewers
would conduct population, housing and agriculture interviews. Designs
8 to 13 assume that one set of interviewers (urban) would conduct
population and housing interviews and another set (rural) would con- 
duct agriculture interviews, within the same sample of primary sam- 
pling units but in different segments. Designs 2, 3, 6, and 7 show differ- 
ences in cost associated with different numbers of primary sampling 
units in the situation where each interviewer would be assigned one 
primary sampling unit. Designs 1 and 4 indicate the case where each 
primary sampling unit is split between interviewers. In this case, there 
is a substantial increase in training and supervision costs with no compensating
economies (under the assumptions made). In Design 5 there
is only 1 interviewer for each 2 primary sampling units, which involves, 
therefore, an increase in travel and enumeration costs (mainly due to 
the fact that the interviewer would be paid subsistence allowance while 
working away from his home primary sampling unit) but this increase 
is compensated by a decrease in the training and supervision costs.
Designs 8 through 13 show similar variations in the more complex situa- 
tion where two types of interviewers are used and the ratios of rural 
interviewers per primary sampling unit and urban interviewers per 
primary sampling unit are varied.

Within the selected primary sampling units, it was decided to use
segments containing about 6 households in urban areas and about 10
households in rural areas. From a standpoint of sampling variance, this
size of segment may be too large for maximum efficiency. On the other
hand, smaller segments would have meant problems in obtaining clear
definition of boundaries, thus introducing “coverage” errors into the
Post-Enumeration Survey.


SPECIALIZATION OF INTERVIEWERS 


The decision to separate the agriculture check from the check of 
population and housing and to specialize the interviewers has been 
noted above. While estimates of the cost of such specialization are avail- 
able (see Table I), no estimates of the gains have been made—Table I 
presents cost for a fixed sampling error but does not take into account
other types of error. It was anticipated that the complexity of the
interviewer's job might result in some confusion on his part so that his
performance would be seriously impaired. It was also felt that, within the
available training budget, intensive instruction on all phases was
impossible unless the scope of the job was reduced. Thus, the anticipated
gain from specializing interviewers was in accuracy and in reduction of
the cost of correcting inadequate work. The magnitude of these gains
could not be estimated even roughly in advance, and the survey itself
yields only impressions on this point. While the decision made assumes
implicitly that the gain was worth an estimated 20 to 30 thousand
dollars additional cost (the difference in cost as shown in Table I is
approximately $35,000 to $45,000, but part of this is chargeable to
specialization and part to the additional sample in rural areas), this is a
purely intuitive assumption. Subjective impressions indicate that
interviewer training and initial performance were better than might have
been obtained without specialization.


TRANSCRIPTION 


Obviously the sample design affected (and was affected by) almost
all phases of the Post-Enumeration Survey. Another problem which
had considerable ramifications was the question of whether or not the
re-interviewer should be provided with the results of the
original enumeration.

There are definite advantages (at least in theory) to giving the checkers
the original data: (1) it permits the re-interviewer to check his own
answers against the original entry and to determine more positively
what the correct answer is; and (2) important leads can be obtained for
the improvement of future surveys by having the re-interviewer set
down as much information as possible about the reasons for any errors
found. On the other hand, there may be danger of minimizing the extent
of error due to the interviewer “confirming” the original entry even
where it is wrong. In an early Census pretest there was evidence that
this occurred. However, the post-enumeration survey interviewers in
this pretest were not specially selected or trained, and little restriction
was placed on their use of the original data.


TABLE II


PER CENT OF POPULATION AND HOUSING SCHEDULES SHOWING
A DIFFERENCE BETWEEN ORIGINAL AND RE-INTERVIEW
ENTRIES, POST-ENUMERATION SURVEY OF MAY 1949
CENSUS PRETEST


                          Group A*                        Group B*
                          (Census Entries Available       (Census Entries Not Available
                          to Post-Enumeration             to Post-Enumeration
Item                      Survey Interviewers)            Survey Interviewers)

Age                       29                              25
Place of Birth            †                               †
Work Status               11                              8
Occupation                29                              81
Industry                  14                              19
Class of Worker           25                              23
Highest School Grade      41                              38
Condition of Unit         19                              18
Water Supply              6                               15
Bathtub or Shower         2                               8
Tenure                    11                              18


* Percentages for housing items based on 54 Group A and 60 Group B cases, for highest school
grade on 37 Group A and 34 Group B cases, and for other items on 100 to 200 cases. Figures for Group
A represent the differences remaining after comparison and reconciliation between the re-interviewer's
initial entry and the figure transcribed from the original enumeration. Blank or omitted entries were not
counted as differences.

† Less than one per cent.

The technique finally evolved for use of the original data in the Post- 
Enumeration Survey gave the interviewer the information obtained in 
the Census for the “characteristics” of persons, households, and farms. 
He was not to look at this information until he completed the inde- 
pendent interview and, to reinforce this instruction, the original Census 
entries were transcribed to a page which folded in so that inadvertent 
examination of the entries was practically impossible. After the interviewer
completed his independent questioning, he compared his entries
with the Census data and attempted to reconcile the differences by 
further questioning of the respondent. 

In a test of the effectiveness of this procedure, conducted following a
Census pretest in May 1949, the interviewer was not given the original
Census information for one-half of the sample cases. The results (see
Tables II and III) indicated that the interviewer's initial entries showed
about the same proportion of differences from the Census entries
regardless of whether or not he was given the original Census information.
In some cases, however, the reconciliation disclosed errors in the
interviewer's initial entry which he could then correct. Because of the small
size of the sample and difficulties in defining the population which it
represents, dependable generalizations cannot be made from these data.
However, the data would indicate that it was possible to secure the
advantages cited above of supplying re-interviewers with the original
Census data without serious effects in the direction of “confirming”
erroneous entries. As a further control, no transcription data were
supplied to the re-interviewer for a small proportion of units (5 per
cent of the households and 10 per cent of the farms) in the
Post-Enumeration Survey sample.


TABLE III


NUMBER OF AGRICULTURE QUESTIONNAIRES SHOWING A
DIFFERENCE BETWEEN ORIGINAL AND RE-INTERVIEW
ENTRIES, POST-ENUMERATION SURVEY OF MAY
1949 CENSUS PRETEST


                          Group A                                     Group B
                          (Census entries available to                (Census entries not available to
                          Post-Enumeration Survey interviewers)       Post-Enumeration Survey interviewers)

                                    No. of differences
                          No. of    Before           After            No. of    No. of
Item                      reports   reconciliation*  reconciliation*  reports   differences

Corn
  acres                   24        14               12               22        15
  bushels                 23        19               18               22        17
Cotton
  acres                   23        12               10               15        9
  bales                   23        10               9                15        8
Fruit and nut trees       31        7                7                24        5
Number of cattle          27        9                5                22        6
Chicken eggs, sold
  dozens                  18        16               11               16        14
  value                   18        16               11               16        15
Work off the farm         33        14               14               26        11


* Figures in the column “Before reconciliation” represent differences before the re-interviewer had
examined the data from the original enumeration. Figures in the column “After reconciliation” indicate
the number of differences remaining after comparison and reconciliation between the re-interviewer's
initial entry and the figure transcribed from the original enumeration.
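
One rough way to read Table III is to convert the counts to rates. The
sketch below does this with the counts as printed in the table. The pooled
rates across items are a summary added here purely for illustration (the
same farms contribute to several items, so the pooled figures are not
independent observations), and the function and variable names are not
from the paper.

```python
# (item, A reports, A differences before, A differences after,
#  B reports, B differences), as printed in Table III.
TABLE_III = [
    ("Corn, acres",         24, 14, 12, 22, 15),
    ("Corn, bushels",       23, 19, 18, 22, 17),
    ("Cotton, acres",       23, 12, 10, 15,  9),
    ("Cotton, bales",       23, 10,  9, 15,  8),
    ("Fruit and nut trees", 31,  7,  7, 24,  5),
    ("Number of cattle",    27,  9,  5, 22,  6),
    ("Eggs sold, dozens",   18, 16, 11, 16, 14),
    ("Eggs sold, value",    18, 16, 11, 16, 15),
    ("Work off the farm",   33, 14, 14, 26, 11),
]


def rate(differences, reports):
    """Share of reports showing a difference from the Census entry."""
    return differences / reports


print(f"{'Item':<22}{'A before':>10}{'A after':>10}{'B':>10}")
for item, a_n, a_before, a_after, b_n, b_diff in TABLE_III:
    print(f"{item:<22}{rate(a_before, a_n):>10.0%}"
          f"{rate(a_after, a_n):>10.0%}{rate(b_diff, b_n):>10.0%}")

# Pooled over items: a crude overall comparison only.
a_reports = sum(row[1] for row in TABLE_III)
a_before_total = sum(row[2] for row in TABLE_III)
a_after_total = sum(row[3] for row in TABLE_III)
b_reports = sum(row[4] for row in TABLE_III)
b_diff_total = sum(row[5] for row in TABLE_III)

pooled_a_before = rate(a_before_total, a_reports)
pooled_a_after = rate(a_after_total, a_reports)
pooled_b = rate(b_diff_total, b_reports)
print(f"{'Pooled':<22}{pooled_a_before:>10.0%}"
      f"{pooled_a_after:>10.0%}{pooled_b:>10.0%}")
```

The comparison that bears on the “confirming” question is between the
Group A “before” column and Group B, since both describe the
interviewer's initial, independent entries.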


CHECKING ACCURACY OF INTERVIEWING 


The subsample in which the interviewer was not given the original 
Census data supplies one check on the accuracy of his work. The rela- 
tive number and magnitude of the differences when the interviewer is 
given the original Census data should be about the same as when he 
does not have the data, unless he has looked at the information instead 
of obtaining an independent answer. 

Another vital problem in the interviewing was making every reason- 
able effort to insure completeness of the Post-Enumeration Survey 
canvass. The basic procedure involved giving the interviewer a list of
persons, households, and farms which he was to complete. As a check,
it was decided to delete the names of some persons actually enumerated
in the Census and see whether or not the Post-Enumeration Survey
interviewer “picked up” those persons. The re-interviewer's picking-up
of the omitted person indicated that he was correctly carrying out the
procedure for the detection of missed persons. It is, of course, no guarantee
that the procedure was effective in detecting missed persons.
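
The deleted-name check lends itself to a simple tally per interviewer. The
sketch below is hypothetical throughout: the record layout, the interviewer
labels, the sample outcomes, and the 0.8 review threshold are assumptions
for illustration, not part of the Post-Enumeration Survey procedure.

```python
from collections import defaultdict


def pickup_rates(records):
    """records: iterable of (interviewer_id, picked_up) pairs, one per
    deliberately deleted name. Returns {interviewer_id: share picked up}."""
    deleted = defaultdict(int)
    found = defaultdict(int)
    for interviewer_id, picked_up in records:
        deleted[interviewer_id] += 1
        found[interviewer_id] += int(picked_up)
    return {i: found[i] / deleted[i] for i in deleted}


def needs_review(rates, threshold=0.8):
    """Interviewers whose pickup rate falls below an assumed threshold."""
    return sorted(i for i, r in rates.items() if r < threshold)


# Hypothetical outcomes for two interviewers.
records = [
    ("interviewer_1", True), ("interviewer_1", True),
    ("interviewer_1", False), ("interviewer_1", True),
    ("interviewer_2", True), ("interviewer_2", True),
]

rates = pickup_rates(records)
flagged = needs_review(rates)
print(rates)
print("review:", flagged)
```

A high pickup rate shows only that the listing procedure was being
followed; it does not show that the procedure finds persons whom the
Census actually missed.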

A sample of about 170 Post-Enumeration Survey segments was re-checked
by professional personnel of the Bureau to determine the accuracy
of the identification of missed households. The results of this
re-check indicated that the Post-Enumeration Survey may have underestimated
the (net) number of missed dwelling units by 30 per cent.
This estimate is, however, subject to a sampling error of 15 per cent.
The effect of this deficiency in the Post-Enumeration Survey on the
estimates of missed persons appears to be small since the omitted
dwelling units were, in large part, vacant units or units with only one
occupant.

DESIGN OF THE INTERVIEW 


In developing the questions used in the Post-Enumeration Survey, 
the possibility of simply repeating the original Census questions was
considered and, quite early, rejected. It was felt that this procedure
would give a measure primarily of “variance” rather than of accuracy.
Every effort was made to incorporate checks on those features which 
seemed to be a source of error in previous surveys. In many cases, 
several questions were asked in the Post-Enumeration Survey to elicit 
data obtained by a single question in the Census. The purpose of mul- 
tiple questions was to prevent error by calling the attention of respond- 
ents and interviewers to any inconsistencies or misunderstandings and 
to assist the respondent toward a more complete recall of the desired 
information. The development of the questions on income illustrates 
this procedure. 

In the Census three questions were asked about income. These ques- 
tions referred to (1) wage and salary income, (2) income from own 
business, (3) other kinds of income (e.g., pensions, rent, dividends from 
stocks, bonds, etc.). Considerable experimentation was carried on with 
question wordings pertaining to income in various Census surveys and 
also in the pretest of the Post-Enumeration Survey. Most of the varia- 
tions in the wordings of income questions attempted to check on the 
specific errors which were thought to be most frequent. 

In a special study on housing in Baltimore, a different approach, the 
“job history” approach, was used to obtain a check on the accuracy of
wage and salary income data. The essential feature of this approach is 
accounting for all periods of employment or unemployment during the 
year. For each job held, a series of “check” questions on income is 
asked and for each period of unemployment, questions are asked about 
unemployment compensation. The job history approach makes for a 
considerably longer interview but the results of the Baltimore study 
indicated that the job history approach detects income which may be 
omitted when other techniques are used. Consequently, for the Post- 
Enumeration Survey, the job history approach was adopted.

Another important consideration in design of the interview was the 
specification of the respondent. In the Census, the interviewer was permitted
to obtain information from any one of several qualified respondents.
It was suggested that the interviews in the Post-Enumeration
Survey be obtained from the person himself (except for minor children, 
mentally incompetent individuals and similar cases). It was antici- 
pated that this would improve the accuracy of the responses (since the 
person himself is usually the best informed respondent for his own 
characteristics) but might be excessively costly and might also result 
in a higher non-interview rate. The costs and non-interview rates were, 
therefore, explored in the Post-Enumeration Survey pretests. These
pretests showed that the restrictions on the respondent resulted in no
significant increase in non-interview rates and somewhere between 40 
and 50 per cent additional calls (and, presumably, about this increase in 
total travel and “contact” costs). It was decided that the cost increase 
would not be prohibitive and the Post-Enumeration Survey interview- 
ers were told to interview only the person himself (with the exceptions 
noted). No data are available for determining whether the increased 
costs were justified. It was proposed that a test of this point be incorporated
in the Post-Enumeration Survey but the proposal was rejected
as representing a substantial additional complication in both
transcription and interviewing. 


TIMING OF THE POST-ENUMERATION SURVEY 


It was necessary to weigh a possible loss of accuracy if the Post-
Enumeration Survey were started late against a loss if it were started 
early. If the interviewing were to be postponed, respondents would be 
questioned about an event which might be increasingly remote in time 
(and possibly in place). This is not serious for “static” items such as 
date of birth, but is very important for items whose recall tends to be 
obscured by subsequent events such as jobs held and income received
during 1949.

On the other hand, it would have been undesirable to start the Post- 
Enumeration Survey before the completion of the Census enumeration. 
Assurance that there would not be a bias in the Census enumeration of 
units in the Post-Enumeration Survey sample could be achieved only 
by making sure that these units were not distinguished in any way 
from other units during the Census. Theoretically, a given segment
could have been started as soon as enumeration had been completed 
for the Enumeration District in which it was located. Actually, since 
Enumeration Districts, even within a restricted area such as a county,
are completed at widely different times, there were two objections: 


1. Selection of interviewers from a given county would have been 
notification that this county was in the Post-Enumeration Survey 
before all work in the county had been completed. 

2. Before the Post-Enumeration Survey work in a segment could 
start it was necessary to transcribe information from the Census 
schedules. If this had been done in the local Census office, the 
local supervisor might have been motivated to have the Post- 
Enumeration Survey areas checked and the check might have resulted
in alterations of the Census results.

Consideration of the dangers of bias led to a decision that transcription
be done after all Census enumeration in a county had been completed
and sent to the central Census processing sections in Washington and 
Philadelphia. 

In retrospect, the disadvantages of an early start seem as serious as
they did at the time; but costs of delay, even beyond the delay antici-
pated, seem to have been less serious, i.e., observation of the Post-
Enumeration Survey interviewing suggests less recall loss and less diffi-
culty in locating respondents than had been feared.


SELECTION OF INTERVIEWERS 


It was generally agreed that the nature of the Post-Enumeration 
Survey required interviewers of top competence. What was meant by 
“top competence” and how to select such interviewers were matters on
which it was almost impossible to secure agreement. Among the qualifi-
cations one might list for a “good interviewer” are intelligence (or, more
generally, “alertness”) and ability to get people to talk freely (secure
“rapport”). For Census work, certain less obvious qualifications seem
to be important. Coverage is essential and coverage requires meticu- 
lousness and persistence; requires, in fact, an “over-conscientiousness” 
which may be in direct opposition to the characteristics ordinarily as- 
sociated with a “good interviewer.” Checking answers requires an in- 
terviewer who will critically evaluate each answer and will not accept it 
until he has probed the matter to the hilt. However, a critical attitude 
alone is insufficient; the respondent’s cooperation and interest are essen-
tial and the interviewer must be able to secure these and still maintain
a “cautious skepticism.” 

All of the above are qualifications hard to define and equally hard to
recognize even if found. To set down criteria (with sufficient exactness
to make them usable) was practically impossible. One suggested pro-
cedure was direct recruitment by persons traveling out of the central
office. Such persons might, it was felt, apply adequately criteria of the 
type cited without elaborate written specifications. Aside from the fact 
that such an advantage was only hypothetical and the fact that suffi- 
cient central personnel were not available and that there are counter- 
balancing advantages to field selection of the interviewers, considera- 
tions of morale in the field organization can be raised against such a 
procedure. It would seem desirable to let the immediate supervisor 
have final choice of the interviewers with the accompanying feeling of 
responsibility for their performance, i.e., not put the supervisor in the 
position of excusing poor performance as being the work of “a bunch of
deadheads foisted on him by the central office.” On the other hand,
complete freedom of choice for the local supervisor might, in some cases, 
mean the placing of personal friendship or other considerations above 
basic qualifications for the job, particularly if interviewer selection were 
to be initiated before the supervisor had been given a complete picture 
of the scope of the job required. These considerations pointed to re- 
stricting the supervisor's choice to those candidates meeting specified 
minimum requirements but, within this class, leaving the final choice 
in the supervisor’s hands. 

Another knotty problem was whether supervisors and interviewers 
should be selected only from persons with previous Census experience. 
With regard to the supervisors, there was unanimous agreement that 
previous Census experience was essential, since an inexperienced person 
would have his energies diverted from his major tasks by difficulties 
involved in mastering Census administrative procedures, which, how- 
ever, would be familiar to experienced Census supervisors. It was, 
therefore, decided to select Post-Enumeration Survey supervisors from 
experienced members of the Bureau’s permanent field staff. 

Familiarity with Census concepts and procedures was also desirable 
in the interviewers and selecting persons with Census experience would 
mean cost savings in interviewer training. It was also felt that among
the 150,000 enumerators employed on the 17th Census, there must be 
many of outstanding competence and we might be more successful in 
locating these than in locating competent interviewers among candi- 
dates whom we had not had an opportunity to observe in an actual 
interviewing situation. 

As against the cost argument, there was the feeling that restriction 
of selection to Census enumerators ruled out classes of persons among
whom there might be some highly desirable candidates (e.g., graduate
students, teachers and professors, and experienced persons from other
survey organizations available for the Post-Enumeration Survey dur-
ing the summer but not for Census work in the spring). It was also felt
that the problem of getting reliable reports on an individual's per-
formance as a Census enumerator was considerable since the enumera-
tors worked for only a short time and there was little opportunity to
review their work.

Another objection to the use of interviewers who had worked on the 
Census was the possibility of bias if a person happened to be assigned
to check his own work or work done under his supervision. The possible 
bias arising from an interviewer's checking his own work is trivial, since 
the probability of cases from any Enumeration District being in the
sample was quite small (and most enumerators worked in only one or
two Enumeration Districts). Bias due to the interviewer checking work 
which he supervised in the original enumeration might be more sub- 
stantial but could be (and was) avoided by not assigning any inter- 
viewer to any area for which he was responsible during the Census. 

Since there was no basis for assuming that one method of selection 
was better than another in terms of quality, it was finally decided on 
cost grounds to restrict the interviewer selection to persons with ex- 
perience in the Census. This probably imposed only a relatively minor 
restriction (since only 260 interviewers were finally selected and the 
available pool was many thousand). On the other hand, it is by no 
means certain that another type of selection would have meant either 
substantially increased cost or lower quality. 

It was necessary, in large part, to abandon the attempt to spell out
detailed selection criteria. As a first step, selection was restricted to 
persons who made a high score on the general selection test given to all 
candidates for enumerators in the regular Census. Additional criteria 
finally adopted were that (1) the interviewer score in the top two-thirds 
on a Post-Enumeration Survey selection test which measured ability 
to read and follow instructions and a schedule of the type actually used 
(but did not get at any of the personality characteristics cited above); 
(2) geographic distribution of interviewers be restricted so that about
two-thirds of them would live in an area they were to enumerate; and
(3) supervisors interview at least 4 candidates for every interviewer
position. 


TRAINING OF INTERVIEWERS 


Considerable experience was available from the Census and other 
previous studies on techniques of interviewer training. Such techniques 
had apparently worked satisfactorily and it was decided (mostly as a 
matter of convenience) to use the same methods with emphasis on: 


1. The interviewer going through several actual interviews in the 
field under observation as part of his training; it was felt that
this was important in making the “classroom” training more 
meaningful. 

2. Problem solution and the “role-playing” technique (here, the 
trainer acts as respondent and is interviewed by one of the 
trainees while all of the trainees record the answers). 


It had been decided to use as interviewers only persons with previous 
Census experience. However, consideration of the possibilities of inade-
quate original training and development of “bad habits” during their
previous work led to the decision that the Post-Enumeration Survey
training repeat and (on some points) amplify the Census training. The
Post-Enumeration Survey interviewers were given considerable addi- 
tional material on maps and canvassing (which were felt to be impor- 
tant for good coverage) and on interviewing techniques, particularly 
the problems of verifying an answer and probing for reasons for dis- 
crepancies from the original Census response. Considerable emphasis 
was placed upon explaining to the interviewers the reasoning behind the 
phrasing of the questions and the design of the procedures. 

In training the supervisors they were first put through the training 
designed for the interviewers (combining, with appropriate modifica- 
tions, the two training courses planned for rural and urban interview- 
ers). Examination of this experience indicated that training of rural 
interviewers was likely to prove unsatisfactory unless the material was 
simplified or the length of the training extended. To avoid a longer 
training period, the procedures planned for the rural interviewing were 
reviewed. This called attention to the fact that in checking population 
and housing characteristics a rural interviewer was required to use the 
same questionnaires used by an urban interviewer. The rural inter-
viewer filled out these questionnaires only for missed persons and
missed dwelling units and, for this purpose, it was not necessary that 
the information be any more precise or detailed than that collected for 
enumerated units in the original Census. Actually, this rural procedure 
was a carry-over from the earlier planning before it was decided to 
split the interviewer’s job. Although time was short and the training 
and interviewers’ manuals were already printed, it was decided that the 
cost of revision would be less than the cost (and difficulties) of carrying 
out the original plans. A Census schedule which had been used for self- 
enumeration was, therefore, revised by adding questions regarding the 
usual residence at the time of the Census of presumptively missed per- 
sons and this schedule was adopted for use by the rural enumerators. 
Although the work of revising the schedule and manuals was consider-
able (and the timing was hectic), the revision permitted simplification
of the rural interviewer's job and of his training. 


MODIFICATION OF THE SURVEY DESIGN 


Having made the above decisions, work was started on drawing the 
sample, preparing the necessary maps and control lists, designing and 
printing of questionnaires and drafting of instructions. However, the
design adopted was to be viewed as only an initial working document.
While the survey could have been carried through as initially designed,
facts brought out in the course of the work dictated important modifi- 
cations. It was, in fact, a conscious policy to maintain flexibility of 
plans up to the point where efficiency required that they be frozen. 

As indicated above, the sample was intended to yield segments of 
approximately 6 households per segment in urban areas and 10 house- 
holds per segment in rural areas. Figures for about the first 20 per cent 
of the segments drawn indicated an average size considerably larger 
than the expected 7.6 households per segment. The complete sample, 
before the subsegmenting described below, had an average size of about 
35 households per segment. About 10 per cent of the segments had over 
35 households with an average of about 300 households. This was due 
to the fact that, in many cases, although there was known to be a large 
number of households in an area, the available maps did not show the 
detail necessary for subdividing the area and it had to be selected as a 
whole (with a lower weight). In most surveys, this difficulty can be 
overcome by subsampling within the segment and interviewing the 
units in the subsample. For checking coverage, however, it was de-
sirable to have every household in a segment interviewed. One solution
would have been to have the Post-Enumeration Survey interviewer 
divide the larger segments at the time he visited them and cover only
one randomly selected subsegment. The difficulties of getting the Post- 
Enumeration Survey interviewer to execute this task satisfactorily, 
along with his other duties, made this solution unacceptable. Another 
alternative was to do the subsegmenting as a separate operation with 
the Post-Enumeration Survey interviewers, but before their training 
for the actual interviewing. This was rejected because it meant recruit- 
ing the interviewers earlier than had been planned (which was unde- 
sirable both because it meant disclosing prematurely some of the areas 
included in the Post-Enumeration Survey and because it meant a very 
awkward timing). The solution finally adopted was to do the subseg-
menting with personnel from the Bureau’s Geography Division, a solu-
tion which avoided the problem of using the Post-Enumeration Survey
interviewer and had the additional advantage of doing the job with 
exceptionally well-qualified persons. 

As the sample was drawn and detailed specifications of the procedure 
were developed, more precise estimates of total costs could be made. 
It became evident that total costs would exceed the budgeted amount 
unless the design was modified. Since the purpose of the Post-Enumera-
tion Survey was to check the Census results, there was a strong commit-
ment to obtain the maximum accuracy on a given interview even if this
meant a relatively high cost per case. This ruled out effecting economies 
at the risk of reducing quality (e.g., by permitting an interview with 
any responsible person instead of requiring an interview with the per- 
son himself). The alternative was to reduce the sample size. The most 
obvious way to cut costs was to eliminate some primary sampling units 
from the sample. The objection to this procedure was that it meant 
either completely re-drawing the sample (scrapping the work done to
date) or ending up with strata of very unequal size which would have
increased the sampling error disproportionately to the cost savings.
However, oversampling of the primary sampling units in the West
(introduced originally to permit separate estimates for this region for
agricultural characteristics) could be eliminated without appreciably
affecting the variance of national estimates. This decision was adopted
and reduced the number of primary sampling units to 270.

As originally designed the sample called for interviews with all per- 
sons in the sample households. While this procedure is relatively simple, 
it is not necessarily efficient. Furthermore, on some characteristics, 
(e.g., immigration status, education and income) the Census had ob-
tained information only for every fifth person so that for these char-
acteristics the Post-Enumeration Survey sample would have been
one-fifth the size of the sample for other characteristics (e.g., age, 
birthplace, occupation). In view of the budget situation, it was felt that 
cutting the sample for those characteristics obtained of every person in 
the household would not be too serious. Therefore, those persons for 
whom the Census asked all questions and one-quarter of the remaining 
persons in the Post-Enumeration Survey sample households were re-
tained in the sample.
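
The effect of this rule on the number of persons interviewed can be sketched with a little arithmetic. The sketch below assumes that exactly one person in five was asked all Census questions and that the one-quarter rule was applied uniformly to the rest; the variable names are illustrative and the result is an expected fraction only, not a figure reported by the survey.

```python
from fractions import Fraction

# Assumption: exactly every fifth person was asked all Census questions.
census_sample_rate = Fraction(1, 5)
# Assumption: one-quarter of the remaining persons were retained, uniformly.
retain_rate_others = Fraction(1, 4)

# Expected share of persons in sample households kept for interview.
retained = census_sample_rate + retain_rate_others * (1 - census_sample_rate)

# Expected share of the retained persons who belong to the Census sample.
share_from_census_sample = census_sample_rate / retained

print(f"retained fraction of persons: {retained}")
print(f"of these, in the Census sample: {share_from_census_sample}")
```

Under these assumptions about two persons in five are retained, and about half of those retained are persons for whom the Census obtained the sample items.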

Another economy was effected by permitting Post-Enumeration Sur- 
vey interviewers to obtain information for missed units from any re- 
sponsible respondent. The aim of the restrictions on the respondent for 
enumerated units was to improve on the original Census data. With
respect to the detailed information obtained for missed units, it was
only necessary to attain the accuracy level required in the original
Census.


RECORD CHECKS


The design of the Post-Enumeration Survey emphasized securing
results of high accuracy. Nevertheless questions were raised about the
validity of the Post-Enumeration Survey: How can one be sure this
technique would give a better answer? How can you check the errors in
one survey by taking another? This line of questioning frequently 
ended with a suggestion that there were records (presumably authorita- 
tive) which would give a better check than a re-interview. 

However, almost all record-matching studies end with unmatched 
cases and cases where the match is questionable. The errors found for 
the matched cases do not provide an estimate for all cases unless one is 
willing to make arbitrary assumptions with regard to the errors among 
unmatched cases. Even where a match has been established, one cannot 
conclude that the records are necessarily accurate. Where a birth cer- 
tificate was filed soon after the birth, the date recorded on it is, in gen- 
eral, dependable. Where it was filed several years later, one might have 
considerably less confidence in the recorded birth date. The accuracy of
record data varies from one source to another and also from item to
item on the same record. A record in the Veterans Administration files
may provide conclusive evidence that John Doe is a veteran but not 
that he was born on the date stated on that particular record. 

The above considerations sharply limit the utility of record checks.
Where errors are large for the matched cases, however, a record check
provides a valuable warning of unreliability of data and useful leads for
further investigation of sources of error.

It was decided to experiment with record checks for age, birthplace, 
citizenship status, highest grade of school completed, income, veteran 
status, and industry in which employed. In pretests of the Post-Enu- 
meration Survey, additional data were included which might be useful 
in checking these items, e.g., an inquiry regarding branch of service and 
serial number for use in checking veteran status. In completing the 
matches for the pretest data, our attention was called to additional 
items which would improve the efficiency and the accuracy of the 
match. These items were added to the Post-Enumeration Survey ques- 
tionnaire. After collection of the data, the check of highest school grade 
completed was dropped on the grounds that the effort involved would
have been out of proportion to the expected success.


IMPLICATIONS FOR FUTURE RESEARCH 


Although the Post-Enumeration Survey possessed certain distinctive 
features peculiar to its purposes and content, its history illustrates the 
complex decision process involved in survey planning. This decision 
process requires the balancing of cost and accuracy considerations both 
within a particular aspect of a survey and among different aspects. The
over-all cost of a survey and the accuracy of the estimates the survey
provides are resultants of the costs and accuracies associated with a 
variety of processes—sample selection, interviewing, coding, tabulat- 
ing, etc. Since most surveys have a fixed over-all expenditure limit, 
increased expense to improve the accuracy of one phase will usually
require curtailed expenditure and lowered accuracy in some other
phase; for example, it may require a decrease in sample size with conse-
quent increase in variance. Between survey processes a balance is nec-
essary; to achieve a reduction of 5 per cent in coding error at the ex-
pense of decreased checking of the interviewing with an attendant in-
crease of 10 per cent in interviewing error is hardly desirable, particu-
larly if the interviewing error is already larger than the coding error.
Within a given survey process—for example, interviewing—an in- 
creased expenditure for training must be balanced against an increase 
in expenditures for supervision, interviewer salaries, more detailed 
questions, etc., in terms of the relative effect of the various expenditures 
on the quality of the interview output. 

In the process of balancing errors and costs between different survey 
operations, it is important to distinguish between the “variable” errors 
which decrease with increasing size of sample and the biases—which 
are unaffected by the sample size. A survey design may be character-
ized by its sampling variance and bias and, in addition, by the variance
and bias arising from other sources, such as interviewer, respondent, and 
processing error. By “nonsampling variance” is meant the variability 
of errors around the average error which is the bias. Depending upon 
the relationship between variable errors and biases, increased expendi- 
ture per case may or may not lead to a decrease in the over-all error. For 
example: Suppose that Method A with an expenditure of $1.00 per case 
would, if applied to the whole population, give an error (bias) of .2 
year in an estimate of average schooling and if the estimate were 
based on a random sample of a single case, give an (expected) error of 
10 years. Assume, also, that Method B with an expenditure of $20 per 
case would give an error (bias) of .05 year if applied to the whole 
population and of 8 years if applied to a single (randomly selected) case. 
The difference between the two methods arises, let us say, from greater 
expenditures for training and checking on all survey processes in the 
case of Method B. If cost were not a factor, Method B would be prefer- 
able. In most cases, cost is a factor and operates differently depending 
upon the total funds available; using a simple random sample, Method 
A would give a mean square error of .060 for a total survey budget of
$5,000 compared with a mean square error of .258 for Method B. With
a total survey budget of $100,000, the relationship would be reversed,
Method A showing .041 mean square error and Method B .015.¹
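
The arithmetic behind these figures can be reproduced with a short sketch. It assumes, as the example appears to imply, a simple random sample of n = budget / cost per case, a mean square error equal to the squared bias plus the squared single-case error divided by n, and no finite-population correction. The function and variable names are illustrative only.

```python
def mean_square_error(budget, cost_per_case, bias, single_case_error):
    """Return bias^2 + (single-case error)^2 / n for a simple random sample.

    Assumes n = budget / cost_per_case and ignores any finite-population
    correction (both are assumptions made for this sketch).
    """
    n = budget / cost_per_case
    return bias ** 2 + single_case_error ** 2 / n


# Figures taken from the example in the text (errors in years of schooling).
methods = {
    "A": {"cost_per_case": 1.00, "bias": 0.20, "single_case_error": 10.0},
    "B": {"cost_per_case": 20.00, "bias": 0.05, "single_case_error": 8.0},
}

results = {
    (name, budget): mean_square_error(budget, **params)
    for name, params in methods.items()
    for budget in (5_000, 100_000)
}

for (name, budget), mse in sorted(results.items()):
    print(f"Method {name}, budget ${budget:,}: MSE = {mse:.4f}")
```

Under these assumptions the sketch agrees with the quoted figures to within rounding: the cheaper Method A has the smaller mean square error at $5,000, while at $100,000 its bias dominates and Method B is better.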

In planning a survey, it is not sufficient to know that one technique 
gives more accurate results for individual cases than does another tech- 
nique. The amount of the improvement and the relative costs of the 
two techniques must also be known. In addition, the relationship 
among the different phases of a survey may be extremely important. 
Since the error in a survey estimate is not a sum of the errors in the 
different survey operations but tends rather to be the square root of a
sum of squares of errors, reduction of a given amount in a relatively
large source of error has more impact on total error than an equal re-
duction in a small source. For example, reducing the response bias in
an estimate of average income from $250 to $200 will have more effect 
on total error than reducing the sampling error (standard deviation of 
the estimate) from $100 to $50. 
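
The income example can be checked numerically. The sketch below assumes that response bias and sampling error are the only two components and that they combine as the square root of the sum of their squares; the function name is illustrative.

```python
import math


def total_error(bias, sampling_error):
    """Combine two error components as the root of the sum of squares."""
    return math.sqrt(bias ** 2 + sampling_error ** 2)


baseline = total_error(250, 100)       # response bias $250, sampling error $100
lower_bias = total_error(200, 100)     # response bias reduced by $50
lower_sampling = total_error(250, 50)  # sampling error reduced by $50

print(f"baseline total error:         ${baseline:.0f}")
print(f"after reducing bias:          ${lower_bias:.0f}")
print(f"after reducing sampling error: ${lower_sampling:.0f}")
```

On these assumptions the total error falls from roughly $269 to roughly $224 when the bias is reduced, but only to roughly $255 when the sampling error is reduced by the same amount.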

Reasonably satisfactory techniques are available for assessing the 
gain of a given sampling design in terms of its effect upon cost and 
accuracy, and for evaluation of processing operations. At present, tech- 
niques available for the assessment of a given interviewing procedure 
are not nearly so satisfactory. In some cases there may be evidence that
one interviewing method gives greater accuracy than some other 
method. However, in almost all cases, we have had no specific determi- 
nation of how much improvement in accuracy is obtained by a given 
interviewing technique and cannot, therefore, determine whether the 
gain in accuracy justifies the increase in cost. Although present meth- 
ods for measurement of the error of an interview may fall far short of 
perfection in giving an objective evaluation of quality, they can often 
provide relative measures of reliability in the form of lower bounds.
Such data are useful and usable in improving survey design. They also 
will provide a basis for improving techniques of measurement. The de-
velopment of more satisfactory techniques for the measurement of in-
terview error, and collection of data for the evaluation of specific survey
methods, constitute in our opinion the most essential steps to be taken 
toward the improvement in survey design.


ON THE DISTINCTION BETWEEN ENUMERATIVE
AND ANALYTIC SURVEYS*


W. EDWARDS DEMING
Bureau of the Budget and New York University 


I. DEFINITION OF THE ENUMERATIVE AND ANALYTIC USES OF DATA 


Purpose of the paper. Statistical data are supposedly collected to
provide a rational basis for action. The action may call for the
enumerative interpretation of the data, or it may call for the analytic 
interpretation. 

The aim here is to exhibit some of the consequences of failing to dis- 
tinguish between the enumerative and the analytic uses of data. This 
distinction is necessary in the statement of the aims of a survey, census, 
or experiment, in order that the plans for the collection of the data and 
for the tabulations may most economically meet the needs of the con- 
sumer, and it is equally important in the interpretation of data. 

Thus, to draw on a result from a later paragraph, information ob- 
tained in a complete census concerning every person in an area (e.g., 
on occupation, income, or education) still possesses for analytic pur-
poses a sampling error that is actually about a quarter as great as the
sampling error of a 6 per cent sample. The consequences are far-reach-
ing. In using a census-table for analytic purposes, even though the
figures come from a perfect complete count, it is therefore necessary 
to bear in mind that small numbers in a cell are unreliable in the sense 
that they have a standard error, just as if they had arisen in sampling, 
as indeed they did. Moreover, in the planning of a complete census, it 
is therefore imperative to use sampling for every bit of information 
that is not necessary as an aid to complete coverage, or required to give 
detail for small areas (such as the block statistics). Name, relationship 
to the head, age, sex, marital status, color are probably all necessary 
for the sake of completeness of coverage. These things, plus a few ques- 
tions on rent, tenure, year built, will provide the information required 
for the block statistics. 
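
The "about a quarter" figure can be checked with a rough sketch. The derivation belongs to a later paragraph that is not reproduced here, so the sketch rests on an assumption: that for analytic purposes a complete count of N persons behaves like a random sample of size N from the underlying cause system, with a standard error proportional to 1/sqrt(n) and no finite-population correction. The function name is illustrative.

```python
import math


def analytic_se_ratio(sampling_fraction):
    """Ratio of the analytic standard error of a complete count (n = N)
    to that of a sample of n = sampling_fraction * N, assuming the
    standard error is proportional to 1 / sqrt(n)."""
    return math.sqrt(sampling_fraction)


ratio = analytic_se_ratio(0.06)
print(f"complete count relative to a 6 per cent sample: {ratio:.3f}")
```

Under this assumption the ratio is a little under 0.25, which is consistent with the statement that the complete census has about a quarter of the sampling error of the 6 per cent sample.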

To draw on another result, we shall see that it is often impossible to 
design a survey that will supply economically information for both 
enumerative and analytic purposes. For example, in a marketing sur- 
vey, the best design for an estimate of the number of people who prefer 
to use ground coffee at home, rather than soluble coffee, requires, for
greatest economy, one type of sample design; whereas a study of the
reasons, or even of the difference in the two proportions, requires an-
other design. One must be prepared to make some sacrifices in preci-
sion, as it may not be economical to satisfy both aims simultaneously.


* Delivered at a conference on sampling conducted by the Institute of Statistics, University of
North Carolina at the Blue Ridge Assembly, 21 July 1952.

The distinction between the enumerative and analytic uses of data.¹
Briefly, the enumerative question is how many? The analytic question 
is why? is there any difference between the two classes, and if so, how 
big are the differences? 

In the enumerative problem, some action is to be taken because the 
frequency of some particular characteristic of the universe is found to 
exceed some critical value. The crop of wheat, according to a sample 
survey, turns out to be large or small. As a consequence, the market 
goes down or up, and production of meat, cereal products, and of sub- 
stitutes shifts one way or another because of this information. The 
Census, or perhaps a sample study of birth registrations, shows that 
in a particular city the number of children in the primary schools will 
be much greater in 4 years than now. Bonds are issued; work com- 
mences on a new school building. Inspection of a sample of wool or 
of cotton may determine its disposition, the price to be paid for it, 
and what kind of cloth and of garments to make of it. Inspection of 
a lot of industrial product determines whether it will be accepted or 
subjected to screening or to a lower rating, or outright rejection. 

Such problems are enumerative because they depend purely on a
determination of the number of people in an area, or the inventory 
of grain, or the production of grain, or the quality of a product. They 
do not involve the analytic question of why all these people are there 
or why the crop this year is what it is; or why the wool or cotton or 
product is so good or so bad.

When certain cities in America were swelled with in-migrants be-
cause of war production in the spring of 1944, special censuses were
taken with the aim of arriving at equitable allocations of food, gaso- 
line, repair parts for buses and trolley cars, and other necessaries of 
living. Equitable distribution of supplies to these cities was impossi- 
ble because no one knew just how many people were in them: assertions 
of editors and chambers of commerce did not provide a basis for ac-
tion. The problem was enumerative because the action (viz., allocation
of food and materials) depended on how many people were there and
not why they were there.


¹ These are the terms that I invented for Chapter 7 in my book Some Theory of Sampling (John
Wiley, 1950). The terms are not important; the concepts are. The concepts are old, but plain statements
of what they are and of the consequences of failing to keep them in mind in design and analysis are not
easy to find. Similar but not exactly parallel concepts occur in the analysis of variance, under the terms
Model I and Model II, a lucid explanation of which occurs in the paper by Churchill Eisenhart, “The
assumptions underlying the analysis of variance,” Biometrics, 3 (1947), 1-21.

By law, the Social Security program is partly an enumerative prob- 
lem because federal reimbursement to a state depends on the number 
of inhabitants 65 and over within the state. Public health programs, 
agricultural adjustments, and other allotments depend on population 
and acreage, and are examples of enumerative uses of data. Adminis- 
trative problems concerned with the long-range aims of these programs, 
however, are analytic. 

In the analytic problem, the action is to be directed at the underly-
ing causes that have made the frequencies of the various classes of the
population what they are, in order to govern the frequencies of these 
classes in time to come. Familiar examples of analytic studies are found 
in intelligent city-planning. More familiar studies are the differential 
effects of varieties and of treatments in agriculture and entomology. 
The particular crops that are measured are of interest only because
they aid decisions on what varieties and treatments to use for the 
best results in crops yet to be planted. We may run an experiment 
with a group of test animals or with patients in a hospital, but when
we generalize from these tests we are thinking of the production 
process: what will it produce in the future? The present tests are 
important only because they help us to prescribe or to modify the 
treatment for future use. The control-chart is a splendid example, the 
purpose being to control the production process and the quality of lots
yet to be made. Other examples are medical and social studies wherein
interest centers in the causes that produce differences in health, fer- 
tility, or death-rate in different segments of a population of people. 
Current population surveys in the United States and Canada aid stud-
ies of employment, unemployment, farm and industrial labor, school
attendance, etc. The monthly sample of deaths by causes, published
by the National Office of Vital Statistics in the United States, aids in
the control of epidemics and the spread of disease. Its use is both
enumerative and analytic. 


Special reference to the statistical control of quality. Both the enumerative 
and analytic problems present themselves hourly in the statistical 
control of quality. A batch of product has been produced, let us sup- 
pose, and the machine is already producing another batch. Two ques- 
tions arise: B (analytic). Shall we leave the machine alone, or shall 
we adjust it? Shall we make it run slower or faster, or shall we change 
the type of chemical bath? A (enumerative). What shall we do with 
the batch of product just made? Shall we send it on to the next operation 
(which might be into the consumer's market, or into the next bay 
of the same factory for further work)? Or is the product so defective 
that we must re-work it, sell it as second-class, or scrap it? 

A chemical engineer whose specialty is the production process may 
have a special interest in Problem B, and little in Problem A, which he 
leaves to someone else. On the other hand, if we are the purchaser of a 
batch of product, such as a single automobile, or some paint for our 
home, or a carpet, we certainly have a special interest in Problem A. 
We wish to know the quality of this particular batch of product. It is 
little comfort to know that the process by which it was made was a 
good one, and was in a fine state of statistical control, if the product 
that we ourselves purchase turns out to be defective and unsuited to 
our purpose. A manufacturer, on the other hand, must purchase raw 
materials and assemblies in quantities, week in and week out. In order 
to cut the costs of these materials and to improve their quality, he 
must concern himself not only with Problem A, the inspection of these 
materials upon receipt; he must in addition take a lively interest in 
Problem B, the control of the production processes in the plants of his 
suppliers. 

The methods of the Shewhart control chart are essentially analytic, 
as they tell when to take action on the process. In contrast, the meth- 
ods of acceptance sampling are primarily enumerative, dealing with 
the disposal of a lot, although they react secondarily on the process by 
forcing better control where needed. 


II. THE SAMPLING VARIANCES FOR THE TWO 
TYPES OF DISTRIBUTION 


The two uses re-stated in terms of sampling distributions. Re-stated in 
terms of a mechanism for carrying out the sampling, we may distin- 
guish between the two uses (enumerative and analytic) by considera- 
tion of the two distinct types of repetition of the operations that lead 
to two distinct sampling distributions. In the enumerative case, we 
take repeated random samples from the same lot, and seek the sam- 
pling distribution of the mean or of other statistical measures of these 
samples. In the analytic case, we take repeated random lots from a supply 
or cause system, and we select a random sample from each lot; then 
seek the sampling distribution of the mean or of other statistical measures 
of these samples. 

The use to which the data will be put determines which of the two 


248 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1953 


types of repetition is applicable in any one problem. Unfortunately, 
sometimes we require data from the same survey to serve both pur- 
poses. 

It is helpful to look at a diagram. The figure shows three bowls with 
poker chips, all physically similar, some red and some white. By stirring 
the contents of any bowl thoroughly, and reaching in blind-folded, 
it is possible to satisfy the conditions for a random sample 
satisfactorily. Another way is to give serial numbers to the chips and to draw 
them with random numbers. The bowl on the left represents the process 
or cause system. It is a supply of chips. The bowl in the middle represents 
the lot. It is the people in an area today, or a batch of product, or 
a crop. The lot has come, we suppose here, as a random sample from 
the process. This assumption is over-simplified; nevertheless it is a 
first step to an understanding of analytic problems. The small bowl at 
the right represents a sample drawn from the lot. 

[Figure: three bowls of red and white chips] 

CAUSE SYSTEM OR 
PRODUCTION PROCESS      LOT              SAMPLE 
Initially               Initially 
Mp red                  NP red           r red 
Mq white                NQ white         n-r white 
M total                 N total          n total 

The four possible different variances. Now we are able to state the 
two problems in terms of estimation. In the enumerative problem the 
sample is used for an estimate of the contents of the lot, which is described 
by the proportions P and Q. In the analytic problem the sample 
is used for an estimate of the contents of the supply, which is described 
by the proportions p and q. The same sample serves both purposes, 
but not equally well, for the two estimates have different 
variances. Hence the proper size of sample, and how to interpret a 
sample, will depend on whether the aim is enumerative or analytic. 

The variances of r/n as estimates of p and of P are shown in the accompanying 
table. There are four cases, A, B, C, D, depending on how 
the lot and the sample are drawn. There are two interpretations 
(enumerative and analytic) in each case. 


Table of the variances of $\hat{p}$ and of $\hat{P}$ 
In all cases, $E\hat{p} = p$ and $E\hat{P} = P$ 


The N balls in the lot-       The sample of n is drawn from the lot-container 
container are drawn 
from the supply               With replacement        Without replacement 
                              Case A                  Case B 

APPLICATIONS 


Tabulation plans. The two variances, Var $\hat{p}$ and Var $\hat{P}$, are different. A 
sample therefore contains different amounts of information for the two 
purposes. How do these observations help us in the design of samples? 
They tell us that if there is a definite enumerative aim in finding out 
how many people there are with a given characteristic however rare, 
then the tabulation and printing of small cells may be justified, pro- 
vided the universe will still retain enough of its characteristics by the 
time the sample is tabulated. 

On the other hand, if the aim is analytic, there is the sampling error 
$\sqrt{pq/N}$ even if the figure comes from a complete census, and this error may 
become troublesome in small cells. It will then be well to economize by 
using sampling in the collection of such data, and to determine in advance 
what consolidations may be made in the tabulating and in the 
printing. Much space and money are wasted annually on the tabulation 
of cells that are too small for analytic use, and which have no enumerative 
use. Too often the excuse for tabulating small cells in a complete 
census is that they came from a complete census and must therefore 
be correct. When the use of such tables is analytic only, such arguments 
do not hold: a reconsideration is due. 

For the derivations of the variances see the author's Some Theory of Sampling (John Wiley, 1950). 

Case B is one that often corresponds approximately to many problems 
in real life. For enumerative use we take the proportion r/n in the 
sample to estimate the proportion P in the lot. The variance of the estimate 

$$\hat{P} = \frac{r}{n} \qquad (1)$$

is seen in the table to be 

$$\mathrm{Var}\,\hat{P} = \frac{N-n}{N-1} \cdot \frac{PQ}{n} \qquad (2)$$

which reduces to 0 if the sample is increased to a complete census 
of the lot, when n = N. 

In contrast, for analytic purposes in Case B we use the proportion 
r/n in the sample to estimate the proportion p in the supply. The variance 
of the estimate 

$$\hat{p} = \frac{r}{n} \qquad (3)$$

is seen in the table to be 

$$\mathrm{Var}\,\hat{p} = \frac{pq}{n} \qquad (4)$$

The size N of the lots, although they furnished the samples, does not 
enter into this variance at all. The size N only limits the size of the 
sample: it cannot be bigger than the lot. To reach greater precision 
than pq/N (the variance of a complete census) we must draw another 
lot and sample it also; then combine the two estimates of p. 
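
The contrast between the two Case B variances can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names and the lot of 20 chips with 6 red are illustrative assumptions. It evaluates Eqs. (2) and (4) and confirms Eq. (2) by direct enumeration of the hypergeometric distribution of r.

```python
from fractions import Fraction
from math import comb


def var_enumerative(N, n, P):
    """Eq. (2): variance of r/n as an estimate of the lot proportion P (Case B)."""
    Q = 1 - P
    return Fraction(N - n, N - 1) * P * Q / n


def var_analytic(n, p):
    """Eq. (4): variance of r/n as an estimate of the supply proportion p (Case B)."""
    return p * (1 - p) / n


def var_by_enumeration(N, n, K):
    """Variance of r/n computed directly from the hypergeometric distribution
    of r when the lot holds K red chips out of N (sampling without replacement)."""
    total = comb(N, n)
    probs = {r: Fraction(comb(K, r) * comb(N - K, n - r), total)
             for r in range(max(0, n - (N - K)), min(n, K) + 1)}
    mean = sum(Fraction(r, n) * pr for r, pr in probs.items())
    return sum((Fraction(r, n) - mean) ** 2 * pr for r, pr in probs.items())


if __name__ == "__main__":
    N, K = 20, 6                      # hypothetical lot: 6 red chips among 20
    P = Fraction(K, N)
    for n in (5, 10, 20):
        # The enumerative variance falls to 0 at n = N; the analytic one does not.
        print(n, float(var_enumerative(N, n, P)), float(var_analytic(n, P)))
```

With n = N the enumerative variance vanishes, while the analytic variance is still pq/N, which is the point made in the text about a complete census.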

Effectiveness of a medical treatment. We may see from the following example 
the two possible ways of interpreting data. Three hundred 
ninety-eight patients in a hospital went under treatment for a certain 
are the odds that there are 0, 1, 2, etc., recurrences in the remaining 
patients? Stated another way, what is the highest number of recurrences 
in the original 398 that would permit such a result as often as 

I К ош series. 
Let K be the number of patients in the original 398 who have recurrences. Then 

$$P_0 = \binom{398-K}{250} \bigg/ \binom{398}{250} = \frac{398-K}{398} \cdot \frac{398-K-1}{397} \cdot \frac{398-K-2}{396} \cdots \text{ etc. to 250 fractions.} \qquad (5)$$


K = NP    P       $P_0$ 

0         0       1 
3         .008    .051 
4         .010    .019 
5         .012    .007 
6         .015    .002 
7         .018    .001 

Now suppose that the problem is to predict the proportion that 
will be cured in a succession of lots of patients. This is an analytic 
question, and the theory to use here then is the binomial series. If p 
is the proportion of recurrences in the general population (from which 
by assumption now the 398 patients are a random sample), then the 
probability of the observed result is 

$$P_0 = q^{250} = (1-p)^{250}. \qquad (6)$$

It is interesting to note that the size 398 of the lot does not come into 
this probability at all. 

A few results are in the table here, whence by interpolation we may 
conclude with odds of about 19:1 that the proportion p of recurrences 
in the general population would not be more than 12 in 1000. 


p       $P_0$ 

0       1 
.005    .286 
.010    .081 
.015    .023 
.020    .006 
.025    .002 

This example, though oversimplified, may help to guide the design 
and interpretation of the results of samples. It shows specifically how 
the interpretation changes when we change our aims from the enumera- 
tive to the analytic use. 
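
The two tables of this example can be recomputed in a few lines. This is a sketch under stated assumptions rather than the author's own computation: the function names are hypothetical, and the count of 250 patients observed without recurrence is read off the "250 fractions" of Eq. (5).

```python
from math import comb

LOT = 398        # patients treated (from the text)
FOLLOWED = 250   # patients observed with no recurrence (the 250 fractions of Eq. 5)


def p0_enumerative(K, lot=LOT, sample=FOLLOWED):
    """Eq. (5): hypergeometric chance of seeing no recurrence in the sample
    when exactly K patients in the lot have recurrences."""
    return comb(lot - K, sample) / comb(lot, sample)


def p0_analytic(p, sample=FOLLOWED):
    """Binomial chance of no recurrence in the sample when the cause system
    produces recurrences at rate p; the lot size does not enter."""
    return (1 - p) ** sample


if __name__ == "__main__":
    for K in (0, 3, 4, 5, 6, 7):
        print(K, round(K / LOT, 3), round(p0_enumerative(K), 3))
    for p in (0.0, 0.005, 0.010, 0.015, 0.020, 0.025):
        print(p, round(p0_analytic(p), 3))
    # Value of p at which the observed result has probability 0.05 (odds 19:1).
    print(1 - 0.05 ** (1 / FOLLOWED))
```

The last line solves q^250 = 0.05 for p, which is where the figure of about 12 in 1000 comes from.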

Allocation of sample. In the symbolism just introduced, the analytic 
aim is to measure the difference between the two proportions $p_1$ and $p_2$ 
which exist in two cause systems, or to find out if there is any significant 
difference between $p_1$ and $p_2$. We cannot examine the cause systems 
directly; we can only study two groups of farms, plots, patients, 
or pupils that the two cause systems have produced. That is, we can 
only study two lots, one from one cause system, and one from another. 
We shall assume that Case B fits the actual events. 

The lots may be of different sizes, $N_1$ and $N_2$. At any rate, we take 
samples therefrom of sizes $n_1$ and $n_2$, and we ask what should their 
sizes be to minimize Var $(\hat{p}_2 - \hat{p}_1)$, an analytic purpose. The optimum 
allocation for this purpose requires that 

$$n_1 = k\sigma_1 = k\sqrt{p_1 q_1} \qquad (7)$$

$$n_2 = k\sigma_2 = k\sqrt{p_2 q_2} \qquad (8)$$

where 

$$k = n/(\sigma_1 + \sigma_2). \qquad (9)$$


Now usually, if not always, such problems require the aid of statistical 
techniques only if $p_1$ and $p_2$ are not far apart; if the difference between 
them were wide, we could observe it without help. Hence we may properly 
take 

$$n_1 = n_2$$


as optimum. Thus, the usual practice of taking equal sizes of sample 
for clinical or laboratory tests is correct for minimizing the variance 
of the estimated difference, regardless of the sizes of the populations or 
of the acreages of the crops whence the samples were drawn. 

In contrast, for the enumerative purpose of estimating the over-all 
average $\bar{x}$ or total X of some characteristic (average rent, total number 
of unemployed, total acreage in wheat) in the two lots of size $N_1$ and 
$N_2$, the aim is to minimize the variance of $\bar{x}$ or of X, an enumerative 
purpose. To this end, the optimum size of sample will be 

$$n_1 = nN_1/N, \qquad n_2 = nN_2/N \qquad \text{[for proportionate allocation]} \qquad (10)$$

or, if one prefers, 

$$n_1 = kN_1\sigma_1, \qquad n_2 = kN_2\sigma_2 \qquad \text{[for disproportionate allocation]} \qquad (11)$$

where 

$$k = n/(N_1\sigma_1 + N_2\sigma_2). \qquad (12)$$

Obviously, the optimum allocations for the two purposes will be different 
except when the two lots $N_1$ and $N_2$ are nearly equal in size. 
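
The two allocation rules can be set side by side in code. The sketch below is illustrative only: the function names are hypothetical, N is taken as N_1 + N_2, and the lot sizes in the demonstration are invented for the purpose.

```python
from math import sqrt


def analytic_allocation(n, p1, p2):
    """Eqs. (7)-(9): n_i = k*sigma_i with k = n/(sigma_1 + sigma_2), which
    minimizes the variance of the estimated difference for a fixed total n."""
    s1, s2 = sqrt(p1 * (1 - p1)), sqrt(p2 * (1 - p2))
    k = n / (s1 + s2)
    return k * s1, k * s2


def enumerative_allocation(n, N1, N2, sigma1=None, sigma2=None):
    """Eq. (10) when no sigmas are given (proportionate allocation);
    otherwise Eq. (11): n_i = k*N_i*sigma_i, k = n/(N1*sigma1 + N2*sigma2)."""
    if sigma1 is None or sigma2 is None:
        total = N1 + N2
        return n * N1 / total, n * N2 / total
    k = n / (N1 * sigma1 + N2 * sigma2)
    return k * N1 * sigma1, k * N2 * sigma2


if __name__ == "__main__":
    # Hypothetical lots of 1000 and 3000 units, total sample of 200.
    print(analytic_allocation(200, 0.30, 0.35))      # nearly equal sizes
    print(enumerative_allocation(200, 1000, 3000))   # follows the lot sizes
```

When p_1 and p_2 are close, the analytic rule returns nearly equal sample sizes whatever the lot sizes, while the enumerative rule follows the lot sizes.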

Unfortunately, the purposes of a survey are often both analytic and 
enumerative. In a survey to assist the marketing of a certain brand of 
frozen orange juice, we need to know not only how many people of 
various income levels buy frozen orange juice of a particular brand, but 
of all brands, and tinned unfrozen juices as well, and probably fresh 
fruit besides. These are enumerative counts. Then also, probably more 
important, the survey must discover why people of various groups buy 
or do not buy frozen orange juice and the products that compete with 
it. This kind of question is analytic. The design that is economical for 
one type may not be economical for the other. 

In another study, a research worker wishes to study the variation in 
the behavior of people, classified by age, education, marital status; 
perhaps also by religion, urban and rural residence, and occupation. 
Frequencies are important; but so also are the contributing causes. 
Thus this research presents also both enumerative and analytic problems. 




One way out is to conduct two surveys, one for enumerative purposes 
to ascertain the frequencies of certain behavior by class; another, 
with a more intensive questionnaire, to study the causes. 

Another solution is to make some sort of compromise in the design, 
sacrificing economy and precision in (e.g.) the enumerative results, in 
order to gain something for the analytic uses. How much to sacrifice, 
how far to lean, and which way, can be settled only with consideration 
of the risks and of the losses of making a wrong decision on the basis 
of information not sufficiently precise, and on consideration of the ad- 
ditional cost of getting more precise information. 

A note on acceptance sampling. The probabilities that one encoun- 
ters in the analytic problem in Case B justify the customary 3-sigma 
control limits in the form 


$$\bar{\bar{x}} \pm \frac{3\bar{R}}{d_2\sqrt{n}}$$
($\bar{R}$ is the average range over a series of samples, 
and $\bar{R}/d_2$ is an estimate of $\sigma$) 

for the $\bar{x}$-chart, or 

$$\bar{p} \pm 3\sqrt{\frac{\bar{p}\bar{q}}{n}}$$
($\bar{p}$ is the average fraction defective over a series 
of samples) 

for the p-chart, as an aid for detecting uncontrolled variability. It will 
be observed that the size N of the lot does not appear in these equations 
even if the sample-size n is 100 per cent of N. This form of computa- 
tion is now seen to be correct; it is not an approximation. The justifica- 
tion is the absence of N and of any finite multiplier in the analytic 
Case B. 
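
As a rough illustration of these limits, the sketch below computes both sets. It is not taken from the paper: the function names are hypothetical, the d_2 values are the commonly tabulated control-chart constants for subgroups of 2 to 5, and clipping the p-chart limits to the range 0 to 1 is an added convenience.

```python
from math import sqrt

# Commonly tabulated d2 factors (average range / sigma) by subgroup size n.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}


def xbar_chart_limits(grand_mean, r_bar, n):
    """3-sigma limits for the x-bar chart; r_bar / d2 estimates sigma."""
    half_width = 3 * r_bar / (D2[n] * sqrt(n))
    return grand_mean - half_width, grand_mean + half_width


def p_chart_limits(p_bar, n):
    """3-sigma limits for the p-chart. The lot size N does not appear."""
    half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - half_width), min(1.0, p_bar + half_width)


if __name__ == "__main__":
    print(xbar_chart_limits(10.0, 2.326, 5))
    print(p_chart_limits(0.10, 100))
```

Neither function takes the lot size as an argument, in line with the absence of N and of any finite multiplier in the analytic Case B.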

On the other hand, many writers in dealing with the producer's risk 
in acceptance sampling (the probability that a lot of acceptable quality 
will be rejected) have recommended hypergeometric terms (like 
that in Eq. 5), or rather have reluctantly used binomial terms as approximations 
to the hypergeometric terms. Actually, however, the producer 
is concerned with the problem of keeping his process in control 
and at a desired level p. The quality P of the lots that he produces will 
vary from lot to lot, yet the risk (probability) that a lot will be rejected 
on a single-sampling plan turns out to be a sum of binomial terms typified 
by 

$$\binom{n}{r} q^{n-r} p^r,$$

into which the size N of the lot does not enter at all. This problem behaves 
like the analytic Case B, for which the binomial terms are correct; 
they are not approximations. 

The consumer's risk is another story. The consumer is concerned with 
the particular lots of product that he is purchasing, and he has a sample 
from each lot on the basis of which to decide whether to accept or 
to reject the lot. He may aim to guard against accepting lots with too 
high a value of P, regardless of p and of the state of control. To compute 
on this basis the correct probabilities for the consumer's risk, one 
requires hypergeometric terms. The finite multiplier (N - n)/(N - 1) 
then appears in the variance of $\hat{P}$, because this is the enumerative Case 
B. 

There are thus, strictly, two operating-characteristic (O.C.) curves, 
one for the producer, another for the consumer. In practice, however, 
except when the sample is 20% or more of the lot, the two curves fortunately 
almost coincide, and one curve suffices. 
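
The near-coincidence of the two O.C. curves can be seen numerically. The following sketch assumes a single-sampling plan with sample size n and acceptance number c; the function names and the plan values in the demonstration are illustrative and do not come from the paper.

```python
from math import comb


def accept_prob_process(n, c, p):
    """Producer's view (analytic Case B): probability that a sample of n from a
    process with fraction defective p shows c or fewer defectives (binomial)."""
    return sum(comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(c + 1))


def accept_prob_lot(N, n, c, K):
    """Consumer's view (enumerative Case B): probability that a sample of n drawn
    without replacement from a lot of N holding K defectives shows c or fewer."""
    return sum(comb(K, r) * comb(N - K, n - r) for r in range(c + 1)) / comb(N, n)


if __name__ == "__main__":
    n, c = 50, 1                                   # hypothetical plan
    print(accept_prob_process(n, c, 0.04))         # process at 4 per cent defective
    print(accept_prob_lot(5000, n, c, 200))        # sample is 1 per cent of the lot
    print(accept_prob_lot(100, n, c, 4))           # sample is 50 per cent of the lot
```

For the lot of 5000 the hypergeometric and binomial values are close; for the lot of 100, where the sample is half the lot, they separate noticeably.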

The distinction made here between the different probabilities for the 
producer’s and consumer’s risks is not new. It forms the basis for the 
Dodge-Romig tables, as is clear from their text. An extremely lucid exposition 
appears in the book Sampling Inspection by the Statistical Research 
Group, Columbia University (McGraw-Hill, 1947), pages 183 
and 184. I quote: 

There are two alternative ways of interpreting “percentage of defective 
items in submitted product,” and these lead to somewhat different O.C. 
curves for small inspection lots. (a) The percentage of defective items can 
be considered as applying to each inspection lot separately.... (b) The 
percentage of defective items can be considered as applying to the process.... 
If interpretation (b) is adopted, the resultant O.C. curve does not 
depend on the size of the inspection lot. 

CONFIDENCE INTERVALS FOR THE NUMBER SHOWING 
A CERTAIN CHARACTERISTIC IN A POPULATION 
WHEN SAMPLING IS WITHOUT REPLACEMENT 


Leo Katz 
Michigan State College 


I. SUMMARY 


In a number of different sampling situations, we encounter a problem 
which may be stated as follows: From a population of N objects, 
each of which possesses or does not possess a certain characteristic, we 
select without replacement a random sample of n objects. Observing that 
m of these possess the characteristic in question, we wish to estimate the 
total number M having the property. This problem arises in sample 
surveys of human populations and in sampling inspection of manu- 
factured articles as well as in other, less obvious, situations. 

It is well known (see e.g., F. F. Stephan’s solution, quoted by Dem- 
ing [2, p. 294]) that the maximum likelihood estimate for M is the 
largest integer in (m/n)(N+1). The problem of confidence interval estimation 
for M, however, seems to have escaped systematic investigation 
in the literature. In this paper, the problem is considered and two ap- 
proximate methods are given for construction of these confidence inter- 
vals. On the basis of some preliminary investigations, it appears that 
the second approximation, which allows for correction for discontinuity 
in the observations, agrees quite closely with the exact computations. 


II. FORMULATION OF THE PROBLEM 


The roughly similar problem of confidence interval construction for 
the parameter $\pi$ in an infinite population or where sampling is done 
with replacement has been exhaustively studied by Clopper and Pearson 
[1]. When N is very large, confidence intervals for $M = N\pi$ may be 
obtained in terms of the well-known results for $\pi$. When, however, N is 
small or moderately large, this approximation is not useful and exact 
methods or better approximations are needed. 

Consider the statement of the problem as one involving a fourfold 
table. There are two groups of n and (N-n), respectively. We observe 
that m of the n possess a certain characteristic. Assuming the grouping 
to be arbitrary and random, we wish to estimate the total number, M, 
in the combined groups, possessing the characteristic. Accordingly, we 
have the fourfold table, 

m         n-m           n 
M-m       N-M-n+m       N-n 
M         N-M           N 


where m, n and N are known at the outset, and M is to be estimated. 
It is well known, due to a result of R. A. Fisher given by Yates [6], that 
the exact distribution of m is hypergeometric with probabilities given 
by 

$$P_r\{m \mid N, n; M\} = \frac{n!\,(N-n)!\,M!\,(N-M)!}{N!\,m!\,(n-m)!\,(M-m)!\,(N-M-n+m)!} \qquad (1)$$

where the semicolon before M emphasizes that N and n are known 
numbers while M is a fixed but unknown parameter in our problem. 

The known exact distribution (1) defines implicitly confidence intervals 
for M in terms of the fixed values of n and N and the observed 
value of m. The explicit construction of confidence intervals requires 
the existence of tables of the distributions; it is at this point that the 
difficulties occur. Finney [3] gives percentage points of the distributions 
for n and (N—n) up to 15 each. The author [5], in unpublished tables, 
has computed the cumulative distribution functions for all combina- 
tions with N up to 24. Thus, except for extremely small numbers, the 
exact confidence intervals are not available. We turn, therefore, to ap- 
proximations. 
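
Where tables are lacking, the exact interval implied by (1) can be found today by direct enumeration. The sketch below is one way to do it, not the author's procedure: the function names are hypothetical, and the rule of retaining every M whose two tail probabilities both exceed half the error level is an assumption about how the tails are split.

```python
from math import comb


def pmf(k, N, n, M):
    """Eq. (1): hypergeometric probability of k marked objects in the sample."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)


def upper_tail(m, N, n, M):
    """Probability of m or more marked objects in the sample."""
    return sum(pmf(k, N, n, M) for k in range(m, n + 1))


def lower_tail(m, N, n, M):
    """Probability of m or fewer marked objects in the sample."""
    return sum(pmf(k, N, n, M) for k in range(0, m + 1))


def exact_interval(m, N, n, conf=0.95):
    """Keep every M not rejected by either one-sided test at level
    (1 - conf)/2, and report the smallest and largest such M."""
    half = (1 - conf) / 2
    feasible = range(m, N - (n - m) + 1)   # M must allow m hits and n-m misses
    kept = [M for M in feasible
            if upper_tail(m, N, n, M) > half and lower_tail(m, N, n, M) > half]
    return kept[0], kept[-1]


if __name__ == "__main__":
    # Three of eight drawn from a group of 20, 90 per cent confidence.
    print(exact_interval(3, 20, 8, conf=0.90))
```

The computation enumerates all feasible M, which is inexpensive for lots of the sizes discussed in this paper.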

Chung and DeLury [7] have recently published a set of charts* giving 
90 per cent, 95 per cent and 99 per cent confidence intervals for N = 500, 
2500 and 10,000 with sampling rates of .05 and .1 by tenths to .9. 
Interpolation for other population sizes using $1/\sqrt{N}$ as argument is 
satisfactory; interpolation for other sampling rates than those specified 
is inconvenient, because neighboring values are read from different 
charts. Extrapolation beyond N = 10,000 is reasonably accurate. 

This leaves a gap between about N = 25 and N = 500, not likely to 
be filled by exact computation. The approximation given here may be 
useful in bridging this gap. For example, it might be possible to obtain 
approximate hypersurfaces for upper and lower confidence limits by 
this means, and then to devise systematic corrections in the direction 
of relatively few exactly computed points. 


*See notice in this Journal, 46 (1951), 394-5. 
In the paper by Yates [6] to which reference was made above, it was 
shown that the exact distribution of m may be approximated by a related 
variable having the $\chi^2$ distribution with one degree of freedom. 
($\chi$, then, is a unit normal variable.) Yates further points out that, for 
relatively small fourfold tables, the approximation is much improved 
by correction for continuity, i.e., for the replacement of the discrete 
hypergeometric variable by a continuous variable. $\chi^2$ in the fourfold 
table above is given by 

$$\chi^2 = \frac{N(mN - Mn)^2}{n(N-n)M(N-M)}. \qquad (2)$$

If, in (2), the value of $\chi^2$ at the $100\alpha$ percentage point and the known 
values of the other quantities are inserted, the resulting expression is a 
quadratic equation in M. The discriminant of the quadratic form being 
positive except in the trivial case, n = N, the two real roots are easily 
interpreted. The smaller root, $M_1$, is the value for which $\chi(m)$ is at the 
upper $50\alpha$ percentage point and, hence, m is approximately at the 
upper $50\alpha$ percentage point of its (exact) hypergeometric distribution; 
a similar statement holds for $M_2$. The interval, $M_1 \le M \le M_2$, is then a 
confidence interval for M with confidence coefficient $(1 - \alpha)$. 

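In what has been said immediately above, no allowance has been 
made for the continuity correction. Since, for the smaller root, m is in 
the upper tail of its distribution, the appropriate correction for continuity 
(following Yates) is subtraction of 1/2; for the larger root, we 
use m + 1/2. Thus, the confidence intervals for M, without and with correction 
for continuity, are, respectively, 

$$\begin{aligned}
&\frac{N}{2k_\alpha}\left[k_\alpha + (2m-n) - \sqrt{k_\alpha^2 - \frac{2k_\alpha}{n}\left[m^2 + (n-m)^2\right] + (2m-n)^2}\,\right] \\
&\qquad \le M \le \frac{N}{2k_\alpha}\left[k_\alpha + (2m-n) + \sqrt{k_\alpha^2 - \frac{2k_\alpha}{n}\left[m^2 + (n-m)^2\right] + (2m-n)^2}\,\right]
\end{aligned} \qquad (3)$$

(uncorrected) 

and 

$$\begin{aligned}
&\frac{N}{2k_\alpha}\left[k_\alpha + (2m-n-1) - \sqrt{k_\alpha^2 - \frac{2k_\alpha}{n}\left[(m-\tfrac12)^2 + (n-m+\tfrac12)^2\right] + (2m-n-1)^2}\,\right] \\
&\qquad \le M \le \frac{N}{2k_\alpha}\left[k_\alpha + (2m-n+1) + \sqrt{k_\alpha^2 - \frac{2k_\alpha}{n}\left[(m+\tfrac12)^2 + (n-m-\tfrac12)^2\right] + (2m-n+1)^2}\,\right]
\end{aligned} \qquad (4)$$

(corrected) 

where $k_\alpha = n + (1 - n/N)\chi_\alpha^2$. 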
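
The two approximate intervals are easy to evaluate by machine. The sketch below is a hedged illustration rather than the author's program: the function name is hypothetical, the $\chi^2$ point is obtained as the square of a normal deviate, and setting the lower limit to 0 when m = 0 (and the upper limit to N when m = n) is an added assumption, since the half-unit correction has no meaning there.

```python
from math import sqrt
from statistics import NormalDist


def chi2_interval(m, n, N, conf=0.95, corrected=True):
    """Roots of the quadratic in M obtained from Eq. (2):
    Eq. (4) when corrected for continuity, Eq. (3) otherwise."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # chi-square (1 d.f.) point is z**2
    k = n + (1 - n / N) * z * z                    # k_alpha

    def root(m_adj, sign):
        a = 2 * m_adj - n
        disc = k * k - (2 * k / n) * (m_adj ** 2 + (n - m_adj) ** 2) + a * a
        return N / (2 * k) * (k + a + sign * sqrt(max(disc, 0.0)))

    shift = 0.5 if corrected else 0.0
    lower = 0.0 if m == 0 else root(m - shift, -1)      # assumption: clamp at m = 0
    upper = float(N) if m == n else root(m + shift, +1)  # assumption: clamp at m = n
    return lower, upper


if __name__ == "__main__":
    # Three in favor among nine drawn from a group of 100, 95 per cent confidence.
    print(chi2_interval(3, 9, 100, corrected=False))
    print(chi2_interval(3, 9, 100, corrected=True))
```

Since M is an integer, the lower limit would be rounded up and the upper limit rounded down before reporting an interval.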


IV. SOME EXAMPLES 


The three examples given below are chosen for their dissimilarity; 
they are not intended to be comprehensive. 

(i) In a very small sample inquiry, we ask nine persons, randomly 
selected from a group of 100, whether they are in favor of a certain 
proposal and we find three in favor, six opposed. We wish to construct 
a 95 per cent confidence interval for the number, M, in the whole group, 
in favor of the proposal. Using (a) equation (3), (b) equation (4), and 
(c) exact hypergeometric methods, we obtain the following confidence 
intervals: 


(a) $13 \le M \le 63$ (uncorrected approximation) 
(b) $10 \le M \le 67$ (corrected approximation) 
(c) $9 \le M \le 68$ (exact) 

The interval (b) actually almost included the possibility of 68, since the 
cutoff value was 67.97. The Clopper and Pearson chart would indicate 
M between 7 and 72 inclusive. 

(ii) This example is somewhat unrealistic and is chosen only to illustrate 
the use of existing tables and to point up the desirability of extending 
these tables. In this case, having observed that three out of 
eight randomly chosen from a group of 20 exhibit a characteristic, we 
seek a 90 per cent confidence interval for the number M, in the complete 
group of 20. Here, the $\chi^2$ approximation, corrected for continuity, gives 
3.028 < M < 13.042. These values are in excellent agreement with the 
exact computation based on the following values from the author's 
table [5]: 


$P_r\{m \ge 3 \mid M = 3\} = .0491$ 
$P_r\{m \ge 3 \mid M = 4\} = .1531$ 
$P_r\{m \le 3 \mid M = 13\} = .0521$ 
$P_r\{m \le 3 \mid M = 14\} = .0181$ 


Examination of these probabilities indicates that, on the one hand, 
M = 3 is almost acceptable and, on the other, M = 13 is barely acceptable 
at the 90 per cent level. 

(iii) Grant [4, pp. 412-14] considers in some detail a series of ac- 
ceptance sampling plans in which 5 per cent samples are taken from 
lots of 2500 items. In this series of plans, the acceptance numbers are 
0, 1, 2, 3 and 4, respectively. We now wish to construct 95 per cent 
confidence intervals for the number of defectives in the lot of 2500 
when we observe 0, 1, 2, 3 or 4 defectives in the sample of 125. The 
table below gives confidence intervals obtained by the corrected $\chi^2$ 
approximation. 


95 per cent confidence intervals for number of defectives 
in lot of 2500 when sample size is 125. 


Number of defectives      Confidence interval 
in sample                 (inclusive) 

0                         0- 89 
1                         2-122 
2                         8-152 
3                         16-180 
4                         27-207 


Similar computations were carried out in this case for the normal approximation 
to the binomial approximation to the hypergeometric distribution. 
In every instance, a somewhat wider confidence interval was 
obtained; the results, however, were in excellent agreement in the sense 
that the intervals in the table above fell wholly within the second set. 

In the case of the last example, it was possible to compare the confi- 
dence intervals obtained by our procedure with those given by the 
Chung and DeLury charts. For each of the five acceptance numbers 
above, the Chung and DeLury confidence interval partially overlaps 
the interval given above. For the acceptance number 3, for example, 
their confidence interval is (12, 170) as against our (16, 180). Exact 
computation of the lower confidence limit places it at 14, however, 
while the exact upper limit is at 169. 
THE UP-AND-DOWN METHOD WITH SMALL SAMPLES* 


K. A. BROWNLEE, J. L. HODGES, JR., AND MURRAY ROSENBLATT 
University of Chicago 


I. INTRODUCTION 


The up-and-down method for estimating the 50 per cent response 
point of quantal data was originally devised for testing the sensitivity 
of explosives. Dixon and Mood [6] suggested that this method 
might have advantages in other fields of application. They gave ap- 
proximate maximum likelihood estimates for the parameters of the 
(normal) response curve, and approximate formulas for the standard 
errors of these estimates. They pointed out that, on the basis of formulas 
for the asymptotic variance, the up-and-down method was 30 or 
40 per cent more efficient than the usual probit analysis method. 

In spite of this efficiency advantage, the up-and-down method does 
not seem to have been given much consideration in such fields as bio- 
assay [10, p. 909] or fatigue testing of metals. It has suffered from two 
restrictions which, taken together, have greatly limited its usefulness 
in many experimental situations: 

(a) The trials must be made sequentially; before each trial is started, 
the response to the preceding trials must be known. 

(b) The efficiency advantage has not been explored for small samples: 
Dixon and Mood express the fear (p. 112), that “measures 
of reliability may well be very misleading if the sample size 
is less than forty or fifty.” 


Thus, the minimum duration of the experiment would have to be forty 
or fifty times the mean time required for response. This disadvantage 
would, in many cases, more than outweigh the efficiency advantage. 
The main content of this paper is a report on computations made 
to determine the actual performance of some estimates for the mean 
dosage parameter $\mu$, based on up-and-down series of length 10 or less. 
It is found that the Dixon-Mood formula for the asymptotic variance 
is reasonably reliable even in samples as small as 5 to 10. Thus, restriction 
(b) is not necessary. As a consequence of this fact, the design 
of up-and-down experiments may be altered, so that several independent 
series are run simultaneously, without serious loss of accuracy. 
In this way, restriction (a) can be considerably reduced. Finally, the 


* This work was supported by the Office of Naval Research. 
parallel introduces a flexibility into the design, which permits us to 
take advantage of special features of some experimental situations with 
a still further increase in efficiency. 

We have not considered the problem of estimating the scale parameter 
$\sigma$. The reason for this is partly that $\mu$ is usually the parameter of 
greater interest, but primarily that with small samples no estimate for 
$\sigma$ can be accurate enough to have much value. Even if $\mu$ were known, 
and even if the trials are conducted at stimuli giving the most efficient 
estimation [9], over 200 trials would be required to estimate $\sigma$ within 
20 per cent with confidence of 95 per cent. Our experience is that in 
most experimental situations, the scale parameter is sufficiently stable 
that the experimenter can guess its value in advance from past experience 
more accurately than he can estimate it from a small sample 
[2, p. 476]. Fortunately, our procedures require only that $\sigma$ be known 
within rough limits, and the performance of the estimates for $\mu$ is not 
sensitive to errors in the guessed value of $\sigma$. 


II. THE ESTIMATE $\hat{\mu}$ 


We assume that the stimulus scale has been so chosen that the response 
curve is an integrated normal curve with parameters $\mu$ and $\sigma$. 
That is, the probability of response P to stimulus y is given by 

$$P = \int_{-\infty}^{(y-\mu)/\sigma} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt.$$

We fix a system of equally spaced stimuli, $y_0 \pm id$, $i = 0, 1, 2, \ldots$, and 
perform the first trial at stimulus $y_0$. Each subsequent trial is performed 
at stimulus d units below or above that of the immediately preceding 
trial, according as the immediately preceding trial did or did not evoke 
a positive response. We may without loss of generality choose our scale 
so that $\sigma = 1$, and we do this in order to simplify notation. 

The series of stimuli used in an up-and-down experiment form a 
stochastic process, whose principal feature of interest is that the stimuli 
tend to have a distribution centered at $\mu$. In a series of length n, a 
natural estimate for $\mu$ is simply the arithmetic mean of the n stimuli 
used. In such a series, however, the first stimulus $y_0$ does not contain 
any “information” about $\mu$, since it was chosen in advance by the experimenter. 
On the other hand, the level of stimulus for the (n+1)st 
trial is informative and is known, even though this trial is not performed. 
Let C denote the sum of the stimuli used in trials 2, 3, ..., 
n+1; we shall term this the score. As our first estimate for $\mu$ we shall 
consider $\hat{\mu} = C/n$. 
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The estimate £ is not quite the same as that given by Dixon and 
Mood, but the two are easily seen to be equivalent asymptotically, so 
we may use for й the asymptotic variance formula derived in [6]. This 
is c4? —2G?/n, where G is a function of d and yo. Dixon and Mood 
give a graph of the function G(d, уо— н); a few representative values 
are tabled below. 


      d       G(d, 0)    G(d, d/2)
      0       0.886      0.886
      2/3     0.961      0.961
      1       1.004      1.004
      3/2     1.076      1.102

For given d, the extremes of G will occur approximately at y₀ − μ = 0
and at y₀ − μ = d/2. It appears from this table that for d < 3/2, G does
not sensibly depend on its second argument. This means that the
asymptotic accuracy of the estimate does not depend much on the
“phasing” of μ with the system of stimuli.

Dixon and Mood recommend that d be chosen approximately equal
to σ. The experimenter must guess σ and then set d equal to his guessed
value. The table of G shows that he may err considerably in his guess
of σ without much affecting the asymptotic accuracy of the estimate
of μ.

Our concern is with the actual accuracy of μ̂ for small n. We shall
gauge this by means of the error variance E(μ̂ − μ)², which is the natural
generalization of the variance when biased estimates are being
considered. E(μ̂ − μ)² depends on n, y₀ − μ, and d. It may be computed
recursively. Let C_n(y₀, μ, d) denote the score obtained in an up-and-down
series of length n, with step size d, started at y₀. Let P₀ denote the
probability of positive response at stimulus y₀. We have


\[
C_{n+1}(y_0,\mu,d) =
\begin{cases}
y_0 + d + C_n(y_0+d,\mu,d) & \text{if the first trial fails,}\\
y_0 - d + C_n(y_0-d,\mu,d) & \text{if the first trial succeeds;}
\end{cases}
\]

therefore

\[
E[C_{n+1}(y_0,\mu,d)] = E[C_1(y_0,\mu,d)] + P_0\,E[C_n(y_0-d,\mu,d)]
  + (1-P_0)\,E[C_n(y_0+d,\mu,d)]
\]

and

\[
\begin{aligned}
E[C_{n+1}(y_0,\mu,d) - (n+1)\mu]^2
&= E[y_0 + d - \mu + C_n(y_0+d,\mu,d) - n\mu]^2\,(1-P_0) \\
&\quad + E[y_0 - d - \mu + C_n(y_0-d,\mu,d) - n\mu]^2\,P_0 \\
&= E[C_1(y_0,\mu,d) - \mu]^2 \\
&\quad + 2(y_0-\mu)\{E[C_{n+1}(y_0,\mu,d) - (n+1)\mu] - E[C_1(y_0,\mu,d) - \mu]\} \\
&\quad + 2d\{(1-P_0)E[C_n(y_0+d,\mu,d) - n\mu] - P_0E[C_n(y_0-d,\mu,d) - n\mu]\} \\
&\quad + (1-P_0)E[C_n(y_0+d,\mu,d) - n\mu]^2 + P_0E[C_n(y_0-d,\mu,d) - n\mu]^2 .
\end{aligned}
\]


TABLE 1
E(μ̂) − μ AND E(μ̂ − μ)², d = 1


y₀ − μ


0 .5 1 1.5 2 2.5 3 3.5 4 σ_μ̂²
n 
i 0 лат .317 .634 1.046 1.5 2.008 2.500 83.000 | 5 gig 
1.000 .867 .635  .651 1.182 2.312 4.016 6.253 9.000 : 
° 

2 0 416 „242 .22 .701 1.079 1.525 2.007 2.501 | i og 
.507  .284  .582  .567 .697 1.245 2.354 4.034 6.258 

à 0 .072 1180 .331 .529 .792 1.186 1.658 2.017 672 
417 .865  .481  .381 0.559 .813 1.887 2.449 4.081 

4 0 068 .142 .24  .418 .632  .899 1.219 1.602 Mas 
872  .820  .378  .395 0423 .58%6 .968 1.592 2.621 

5 0 .047 116 .212 537 .506  .732 1.006 1.319 ae 
-328 — .984 .3fı  .284  .362 .498 .700 1.100 1.843 

6 0 .047 .097  .170 .282 .432 .66  .839 1.110 en 
-256 — .254  .279  .252 .300  .359 .558 .824 1.346 

7 0 .034  .084  .153 .248  .800  .520 .78 .956 zn 
249 — .227 .240 0.228 . 7 5302  .448  .057 1.042 

8 0 035  .073 .128 .213 .3% .464 .633 .838 .259 
2318 .206 .220 .205  .231 .266  .975 .529 .827 

9 9 026 .065  .137  .189  .285  .412 .568 .746 “294 
200 187 195 0.194  .211 .92 „ЗМ .40 .683 

10 0 088 .059 103 2.70  .277  .871 2.07 .671 202 
180 — .175 .178 171 .189 .220 .281 .870 .570 

TABLE 2
E(μ̂) − μ AND E(μ̂ − μ)², d = 2/3


y₀ − μ

0 2/3 4/3 2 8/3 σ_μ̂²
n 

i 0 187 188 1.3864 2.005 “Б; 
444 449 .768 189 4.027 : 

{ 0 268  .609 1.0900 1.687 ы 
.386 зт б 1260 2.862 3 

3 0 .215 .491 .877 1.308 2 
296 .315  .449 .883 1.993 ; 

^ 0 ат з 15 1.162 ind 
1259  .978 .364 ‘668 1.420 ў 
0 449 — .339 .609 984 

» 94 .944 308 .515 1.060 Pi 
0 .128 .290 522 845 

E .209 22, 264 418 1816 E 

HAR, 111258 455 737 

7 191 197 233 "m. 1652 xc 

^ 0 .008 ^ .223 .401 651 si 
A74 180 206 “207 .588 : 
0 40870 .199 1858 .581 

2 164 ^64 188 1958 447 5208 
0 .079 1190 .828 ^ 1525 

1 

Y «148 52 169 1928 .381 185 


These formulas enable us to compute E(μ̂) − μ and E(μ̂ − μ)².
Beginning with n = 1 we have

\[ E(\hat\mu) - \mu = E[C_1(y_0,\mu,d)] - \mu = y_0 - \mu + d(1 - 2P_0) \]

and

\[ E(\hat\mu - \mu)^2 = E[C_1(y_0,\mu,d) - \mu]^2
   = (y_0 - \mu + d)^2 (1 - P_0) + (y_0 - \mu - d)^2 P_0 . \]

TABLE 3
E(μ̂) − μ AND E(μ̂ − μ)², d = 3/2


y₀ − μ
0 3/2 3
n 
i 0 .200 1.504 
2.225 .001 2.274 
А 0 -150 854 
863 .864 .881 
5 0 -105 604 
134 527 682 
1 0 -080 456 
.496 .494 508 
š 0 064 .366 
-445 1871 .449 
4 0 054 „305 
347 .346 .350 
P 0 046 262 
-320 .282 322 
в 0 .040 .229 
.267 .266 268 
° 0 036 .204 
250 227. 251 
е 
10 0 «.032 1183 
.216 .216 Eu 


4.500 
20.240 


3.750 
14.063 


3.001 
9.010 


2.302 
5.337 


1.862 
3.582 


1.554 
2.546 


1.883. 
1.938 


1.167 
1.506 


1.037 
1.229 


.933 
1.009 


1.158 


.772 


.579 


.463 


.386 


* .331 


.290 


.257 


.282 


From these the moments for n > 1 follow recursively by using the above
formulae.
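
The recursion lends itself to direct computation. The following is a minimal sketch, assuming σ = 1 and taking μ = 0 so that only the offset y₀ − μ matters; the function names (`normal_cdf`, `bias_and_error_variance`, `score_moments`) are hypothetical, and the memoised recursion is one possible arrangement of the calculation rather than the one used to prepare the tables.

```python
from functools import lru_cache
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal distribution function (sigma = 1)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bias_and_error_variance(n, offset, d):
    """Return (E(mu_hat) - mu, E(mu_hat - mu)^2) for a series of length n.

    offset is y0 - mu; mu is taken as 0 without loss of generality.
    """

    @lru_cache(maxsize=None)
    def score_moments(length, k):
        # First and second moments of C_length - length*mu for a series
        # started k steps away from the original starting level.
        a = offset + k * d              # current value of y0 - mu
        p = normal_cdf(a)               # P0, probability of a positive response
        if length == 1:
            return a + d * (1 - 2 * p), (a + d) ** 2 * (1 - p) + (a - d) ** 2 * p
        up1, up2 = score_moments(length - 1, k + 1)    # first trial failed
        dn1, dn2 = score_moments(length - 1, k - 1)    # first trial succeeded
        first = (1 - p) * (a + d + up1) + p * (a - d + dn1)
        second = ((1 - p) * ((a + d) ** 2 + 2 * (a + d) * up1 + up2)
                  + p * ((a - d) ** 2 + 2 * (a - d) * dn1 + dn2))
        return first, second

    m1, m2 = score_moments(n, 0)
    # The estimate is the score divided by n.
    return m1 / n, m2 / n ** 2
```

With n = 1, d = 1 and y₀ − μ = 0.5 this returns approximately 0.117 and 0.867, as the n = 1 formulas require.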


Tables 1, 2, and 3 give E(μ̂) − μ and E(μ̂ − μ)² for d = 1, 2/3, and 3/2
respectively, and for n = 1(1)10. In Table 1, y₀ − μ = 0(1/2)3, while in
Tables 2 and 3, y₀ − μ = 0(d)3d. For comparison we give also the
asymptotic variance σ_μ̂² = 2G²(d, d/4)/n. We see that in all tables, if
y₀ − μ ≤ 2d, the actual E(μ̂ − μ)² is less than σ_μ̂², but that as y₀ − μ
increases beyond about (5/2)d, E(μ̂ − μ)² rapidly increases, particularly
for small d. This means that the asymptotic variance gives reliable or
even conservative estimates for the accuracy of μ̂ in small samples,
provided the experimenter can manage to start the process within, say,
two testing intervals, 2d, of the mean, μ.


III. THE ESTIMATE μ′


In many experimental situations the experimenter will be able to
guess the value of μ to within 2 steps, where the step is the guessed
value of σ. In these situations the estimate μ̂ is quite satisfactory, being
both simple and accurate. But it is always possible that the guess will
be badly in error, and we now present a modified up-and-down
procedure which, by sacrificing some efficiency when y₀ − μ is small,
provides an estimate with a guaranteed accuracy independent of y₀ − μ.
It is clear that if we start an up-and-down series at too high a stimulus,
we shall very probably obtain only positive responses until the stimulus
level has been reduced to a value near μ. Similarly, if y₀ is far below μ,
this fact will probably reveal itself through an initial run of negative
responses. This suggests that we should interpret an initial run of
responses of the same sign as an indication that y₀ was badly chosen, and
that these trials may well be ignored in estimating μ.¹

We are thus led to formulate the following design. Choose n in
advance, and continue the experiment until there have been n − 1 trials
in addition to the trials in the initial run of constant sign. Thus, the
total number N of trials is a random variable, N ≥ n, with N = n only
if the first and second trials give contrary responses. As score, we use
C′, the sum of the stimuli on the last n − 1 trials and on the (N+1)st
trial (which is, of course, not performed). For our estimate, we use
μ′ = C′/n.

To illustrate, let d = 1, y₀ = 0, n = 3. Suppose that the first two trials
failed, but the third and fourth trials succeeded. This means that the
successive stimulus levels were 0, 1, 2, and 1, and if a fifth trial were to
be performed it would be at level 0. Here N = 4, C′ = 2 + 1 + 0 = 3,
μ′ = 3/3 = 1.

Values of E(μ′) − μ and E(μ′ − μ)² can be computed from those for
E(μ̂) − μ and E(μ̂ − μ)². We first compute the probability of each
possible initial run. Each such run leads to a specific starting value for a
series whose moments can be obtained from tables of E(μ̂) − μ and
E(μ̂ − μ)². We give in Table 4 values of E(μ′) − μ and E(μ′ − μ)² for
d = 2/3, 1, 3/2, y₀ − μ = 0(d)3d, and n = 5 and 10. In Table 5 we give the
expected number of wasted observations, E(N) − n, for d = 2/3, 1, 3/2
and y₀ − μ = 0(d)3d.
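
The expected number of wasted observations depends only on the initial run, so it can be found by summing the probabilities that the run continues. The sketch below assumes σ = 1 and μ = 0; the function names and the truncation tolerance `tol` are choices made here for illustration.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal distribution function (sigma = 1)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def expected_wasted(offset, d, tol=1e-12):
    """E(N) - n: expected number of trials, beyond the first, in the initial run.

    offset is y0 - mu. The sum runs over both possible signs of the run.
    """
    total = 0.0
    for positive in (True, False):
        level = offset
        # Probability that every trial so far gave this response.
        prob = normal_cdf(level) if positive else 1.0 - normal_cdf(level)
        while prob > tol:
            level += -d if positive else d
            prob *= normal_cdf(level) if positive else 1.0 - normal_cdf(level)
            total += prob             # P(the run contains one more trial)
    return total


for steps in range(4):
    print(steps, round(expected_wasted(steps * 1.0, 1.0), 3))
```

For d = 1 this gives values close to 0.162, 0.493, 1.300 and 2.274 at y₀ − μ = 0, d, 2d and 3d.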


¹ The estimate recommended by Dixon and Mood also attempts to avoid using the results of
initial trials made far from μ, but in a way different from that proposed here.


An appropriate measure for the comparison of two methods of
estimation is provided by the inverse of the sampling variance V of the
statistic being estimated. We may thus, following Fisher [8], define a
quantity I such that I = 1/V. In the case of the comparison of
experiments involving different numbers of observations it is fairest to
make the comparison on a per unit observation basis, so we use
I′ = I/n = 1/(nV).

In choosing between the estimates μ̂ and μ′, several considerations
should be kept in mind. If the initial guess for μ is fairly accurate, μ̂
will have the smaller error variance. In addition, μ′ involves “wasted”
observations. Comparing nE(μ̂ − μ)² with E(N)E(μ′ − μ)² for n = 10, μ̂ is
seen to be more efficient than μ′ when y₀ − μ = 0, d, or 2d for d = 2/3, 1,
and 3/2. On the other hand, μ′ is more efficient for large values of
|y₀ − μ|. Furthermore, in using μ′ we have a known upper bound on
the error variance, depending on d but independent of y₀ − μ. Also, the
bias of μ′ is bounded and is considerably smaller than that of μ̂ for
y₀ − μ = 2d and 3d.

An examination of Tables 4 and 5 suggests a further modification of
the sampling method which may have desirable features in some
experimental situations. We observe (a) that the smallest value of d gives
the smallest values for E(μ′ − μ)², while (b) the largest value of d gives
the smallest value of E(N), for a given value of y₀ − μ. This is
intuitively to be expected: a few big steps will get the process into the
neighborhood of μ; but once there, small steps will keep the process
closer to μ.


TABLE 4
E(μ′) − μ AND E(μ′ − μ)², n = 5 AND 10

                                      y₀ − μ
  d     n                     0       d       2d      3d     σ_μ̂²
 2/3    5    E(μ′) − μ        0       …       …       …
             E(μ′ − μ)²     .267    .317    .423    .513
       10    E(μ′) − μ        0     .083      …     .168
             E(μ′ − μ)²     .157    .167    .199    .224    .185
  1     5    E(μ′) − μ        0     .128    .187    .198
             E(μ′ − μ)²     .357    .410    .485    .516
       10    E(μ′) − μ        0     .065    .095    .101
             E(μ′ − μ)²     .191    .202    .222    .230    .202
 3/2    5    E(μ′) − μ        0     .074    .096    .097
             E(μ′ − μ)²     .463    .507    .547    .551
       10    E(μ′) − μ        0     .037    .048    .049
             E(μ′ − μ)²     .238    .240    .251    .252    .232


TABLE 5
EXPECTED NUMBER OF OBSERVATIONS USED IN APPROACHING
μ BUT NOT USED IN THE ESTIMATE OF μ, E(N) − n

 y₀ − μ    d = 2/3    d = 1    d = 3/2
   0        0.276     0.162     0.067
   d        0.500     0.493     0.498
   2d       1.115     1.300     1.429
   3d       1.976     2.274     2.428

Now suppose we begin the process with big steps, but change to
small steps with the first change of sign. This design may be expected
to have the small E(N) characteristic of big steps, and also the small
E(μ′ − μ)² characteristic of small steps. As an illustration, suppose we
begin at y₀ − μ = 4.5 with d = 3/2, and at the first change of sign change
to d = 2/3. We shall have E(N) = 12.43, and a simple computation
(using graphical interpolation in Table 5) gives E(μ′ − μ)² = 0.22,
approximately. Thus E(N)E(μ′ − μ)² = 2.73, approximately. Had we used
d = 3/2 throughout, we should have had E(N)E(μ′ − μ)² = 12.43 × 0.252
= 3.13, while the use of d = 2/3 throughout would give E(N)E(μ′ − μ)²
= 16.7 × 0.224 = 3.74, approximately. The change in step size has given
a substantial reduction in E(N)E(μ′ − μ)².

An elaboration of this idea would involve a series of step sizes,
decreasing to zero. Such a model would have an asymptotic efficiency
with E(N)E(μ′ − μ)² = 1.57. In this connection, see [12].


IV. COMPARISON OF EFFICIENCIES OF UP-AND-DOWN METHOD 
WITH THAT OF PROBIT ANALYSIS 


A comparison of the efficiencies of the estimates μ̂ and μ′ with that
of the standard probit analysis is of interest but rather difficult. In the
case of μ̂ and μ′ the accuracies depend on the choice of d, and for the
probit method the accuracy of the estimate is markedly affected by the
choice of the number of stimuli, their spacing, and the distribution of
the trials among them. Also, the true error variance of the probit
estimate does not seem to be easily obtainable.

We can, however, make the comparison on the following basis. For
the probit method, we shall use a very common design, namely five
stimulus levels equally spaced at intervals d with equal numbers of
trials at each level. We shall assume that the stimulus interval d
represents the experimenter's guess for σ, and shall use this same value for
the step-size in the up-and-down series.

We shall gauge the accuracy of the probit estimate by the usual
approximate formula for the asymptotic variance σ_p² (equation 3.6 of [7]):

\[ \sigma_p^2 = \frac{1}{m}\left[\frac{1}{\sum w_i}
   + \frac{(\bar{x}-\mu)^2}{\sum w_i x_i^2 - \bar{x}\sum w_i x_i}\right], \]

where x_i are the stimuli, w_i their probit weights, m the number of trials
at each stimulus, and x̄ = Σ w_i x_i / Σ w_i.
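
A short numerical sketch of this formula for the five-level design may help. It assumes σ = 1 and μ = 0, takes the probit weight as z²/(PQ) with z the normal ordinate, and multiplies by the number of levels to express the result per observation; the function names and the `offset` argument (the distance of the central stimulus from μ) are choices made here for illustration.

```python
from math import erfc, exp, pi, sqrt


def probit_weight(x):
    """Probit weight z^2 / (P Q) at stimulus x, with mu = 0 and sigma = 1."""
    z = exp(-0.5 * x * x) / sqrt(2.0 * pi)
    p = 0.5 * erfc(-x / sqrt(2.0))    # probability of response
    q = 0.5 * erfc(x / sqrt(2.0))     # probability of no response
    return z * z / (p * q)


def probit_n_variance(offset, d, levels=5):
    """n * sigma_p^2 for equally spaced stimuli centred at mu + offset."""
    half = (levels - 1) / 2.0
    xs = [offset + (i - half) * d for i in range(levels)]
    ws = [probit_weight(x) for x in xs]
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swx2 = sum(w * x * x for w, x in zip(ws, xs))
    xbar = swx / sw
    # m * sigma_p^2, with (xbar - mu)^2 = xbar^2 because mu = 0.
    per_level = 1.0 / sw + xbar ** 2 / (swx2 - xbar * swx)
    return levels * per_level


print(round(probit_n_variance(0.0, 1.0), 2))
```

For d = 1 and a central stimulus at μ this evaluates to about 2.82, and to about 3.13 when the central stimulus is one step from μ.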

In Table 6, column 3 gives 1/I′ for the probit design specified above
for y₀ − μ = 0, d, 2d, and 3d and d = 2/3, 1, and 3/2. From the formula
given for σ_p² we calculated mσ_p² and then nσ_p², where n = 5m = total
number of observations. Columns 4 and 8 are obtained by abstracting
the appropriate values of E(μ̂ − μ)² from Tables 1, 2, and 3 and
multiplying by n = 5 and 10. For columns 6 and 10, E(N) is obtained from
Table 5 and E(μ′ − μ)² from Table 4. Columns 5 and 9 give R₁ defined as

\[ R_1 = \frac{I' \text{ for up-and-down estimate } \hat\mu}{I' \text{ for probit method}}, \]

and columns 7 and 11 give R₂ defined similarly but for μ′. Thus nR₁ is
the number of observations required with the probit method to give the
same error variance as would be obtained from μ̂ with n observations,
and E(N)R₂ is the number of observations required with the probit
method to give the same error variance as would be obtained from μ′
with a number of observations whose expected value is E(N).

All values of R₁ and R₂ exceed 1, indicating that in all the
circumstances considered the probit method requires more observations. When
d = 1 (which is the usual objective for d), the probit method requires
at least 45 per cent more observations than μ′, and at least 55 per cent
more observations than μ̂, for all values of y₀ − μ considered.

It might be felt that the above comparison is unfair to the probit
method, in that we have judged its accuracy by an asymptotic formula.
The only evidence on this point available to us, however, tends to show
that the contrary is true. Berkson [2] has reported on an extensive
sampling experiment one of whose purposes was to evaluate the
small-sample performance of the maximum likelihood estimate for the
parameters of the logistic function. This function is scarcely
distinguishable from the integrated normal. For thirty trials evenly divided
between three equally spaced levels, in estimating the location parameter
with the scale parameter fixed, he found that the actual error variance
of the maximum likelihood estimate was some 10 per cent greater than
predicted by the asymptotic formula. (It should, however, be remarked
that Berkson was able to attain or even slightly better the asymptotic
variance with a minimum chi-square estimate.) While we realize that
Berkson's results are not exactly comparable, they lead us to feel that
the efficiency advantage shown for the up-and-down method in Table 6
is conservative. Further information may be forthcoming with the
publication of [3].


AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1953


TABLE 6
COMPARISON OF EFFICIENCIES OF μ̂ AND μ′ WITH THAT OF A PROBIT ANALYSIS

                        |              n = 5                     |              n = 10
  d   |y₀−μ|   nσ_p²    | nE(μ̂−μ)²   R₁   E(N)E(μ′−μ)²   R₂     | nE(μ̂−μ)²   R₁   E(N)E(μ′−μ)²   R₂
 2/3    0      2.11     |   1.17    1.81      1.41      1.50    |   1.48    1.43      1.61      1.31
        d      2.68     |   1.22    2.19      1.74      1.54    |   1.52    1.76      1.75      1.53
        2d     5.97     |   1.54    3.88      2.59      2.31    |   1.69    3.53      2.21      2.70
        3d     23.6     |   2.58    9.17      3.58      6.60    |   2.28    10.4      2.68      8.80
  1     0      2.82     |   1.64    1.72      1.84      1.53    |   1.80    1.56      1.94      1.45
        d      3.13     |   1.56    2.01      2.25      1.39    |   1.78    1.76      2.12      1.48
        2d     6.98     |   1.81    3.86      3.06      2.29    |   1.89    3.70      2.51      2.78
        3d     63.0     |   3.50    18.0      3.75      16.8    |   2.81    22.4      2.82      22.3
 3/2    0      4.15     |   2.22    1.87      2.62      1.58    |   2.16    1.92      2.54      1.63
        d      4.21     |   1.86    2.27      2.79      1.51    |   2.16    1.95      2.52      1.67
        2d     7.71     |   2.24    3.44      3.52      2.19    |   2.17    3.55      2.87      2.69
        3d     414      |   5.60    74.0      4.09      101     |   4.04    103       3.13      132


We remark that the sense of the comparison of Table 6 is not
dependent on the particular probit design we chose. A probit design which
concentrated the trials more tightly would gain in efficiency somewhat
for y₀ near μ, but would pay for this by blowing up more quickly as the
initial guess became bad; and conversely, a looser distribution of the
trials would give fair performance over a wide range of y₀ values, but
would be even less efficient at y₀ near μ.


V. DESIGNS WHICH REDUCE THE LENGTH OF THE EXPERIMENT 


The necessity of performing the trials sequentially is the principal
objection to the use of the up-and-down method in many quantal
response experiments. The total time required is likely to be prohibitive
unless the response to the stimulus is immediate, as for example in
explosives sensitivity testing.²

We shall now investigate the possibility of mitigating this
disadvantage by running several short up-and-down series simultaneously,
rather than a single long series. Let there be k series, each of length n,
with m = kn the total number of trials. We shall conduct the series
independently of each other, and use the same initial stimulus y₀ and
the same step size d for all k series. From the ith series we form the
estimate μ′_i as in section 3, and for our over-all estimate use

\[ \bar\mu = \frac{1}{k}\sum_{i=1}^{k} \mu'_i . \]

Clearly σ_μ̄² = σ_μ′²/k, while E(μ̄) = E(μ′).
Thus

\[ E(\bar\mu - \mu)^2 = \sigma_{\bar\mu}^2 + [E(\bar\mu) - \mu]^2
   = E(\mu' - \mu)^2/k + [E(\mu') - \mu]^2 (k - 1)/k . \]

The error variance of μ̄ may be obtained at once from Table 4 for
d = 2/3, 1, 3/2, n = 5 and 10.

To illustrate, consider four independent series each of length ten
(k = 4, n = 10, m = 40). The following table gives the error variance
corresponding to several values of y₀ − μ, where we have taken d = 1. We
also give the values of R₃ = (40σ_p²)/[4E(N)E(μ̄ − μ)²].


  y₀ − μ    E(μ̄ − μ)²     R₃
    0         0.048      1.45
    1         0.054      1.39
    2         0.062      2.48
    3         0.065      19.7
    4         0.066       -
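
The pooled error variance is a one-line computation once the single-series bias and error variance are known. In the sketch below the function name `pooled_error_variance` is hypothetical, and the input figures for d = 1, n = 10 are those read from Table 4.

```python
def pooled_error_variance(error_variance, bias, k):
    """Error variance of the mean of k independent estimates mu'.

    error_variance and bias are E(mu' - mu)^2 and E(mu') - mu for one series.
    """
    return error_variance / k + bias ** 2 * (k - 1) / k


# (y0 - mu, bias, error variance) for d = 1, n = 10, read from Table 4.
single_series = [(0, 0.0, 0.191), (1, 0.065, 0.202), (2, 0.095, 0.222)]
for offset, bias, ev in single_series:
    print(offset, round(pooled_error_variance(ev, bias, 4), 3))
```

With k = 4 these inputs reproduce the tabled values 0.048, 0.054 and 0.062.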


² It should not, however, be thought that the total time required to run an up-and-down series is
n times as great as the time required to conduct a probit experiment with n trials. In the first place, the
greater efficiency of the up-and-down method means that fewer than n trials are needed with it: as is
explained in Section 4, something like 2n/3 trials will give the same accuracy as is obtained from a
probit analysis based on n trials. Secondly, in many experiments, particularly in biological ones, the
response time is not fixed but variable. The time required for the probit experiment is thus the
maximum of n response times while that required for the up-and-down method is the sum of 2n/3 response
times. To illustrate, if response times have a chi-square distribution with six degrees of freedom (this
distribution fits fairly well a series of response times published by Perry [11, p. 40]) and if n = 30, the
time required to complete the up-and-down experiment will be 7.6 times as long as that for the
probit experiment, rather than 30 times as long as one might at first think.

For a single series of length 40, the variance as given by the asymptotic
formula is 1.008/20 = 0.050. It appears that there is no appreciable loss
of efficiency in breaking a series of 40 into four series of 10 provided
that a good guess for μ is made. We are also assured of a low bound on
the error variance even if the guess is bad. For all values of y₀ − μ
considered, the ratio R₃ is at least 1.30. This indicates that a 5-level probit
design would require at least 30 per cent more observations to yield an
estimate of the accuracy of μ̄, even for the most favorable choice of y₀.


VI. COMPARATIVE EXPERIMENTS 


In much work of the type considered in this paper, we do not make an
absolute determination but rather a comparison of an unknown with
some standard. What is desired in these situations is an estimate for
the difference between the values of μ associated with two treatments.
For example, in bioassay we may wish to estimate the amount by
which the LD 50, μ₁, of a new drug falls below that, μ₂, of a standard
drug. It is clear that the up-and-down method may be adapted to this
kind of problem. We can simply run two series, one for each drug, and
take μ̂₁ − μ̂₂ as our estimate for μ₁ − μ₂. The possibility of running
several short up-and-down series opens up additional ways of economizing
on sample sizes.

It is well known that the response of an animal to a given dose is
affected by (amongst other things) its weight, age, and previous history.
The usual method of avoiding these disturbances is to select animals as
homogeneous as possible, or, in the case of weight, to make the log
dose simply proportional to the log weight. While in some cases this
procedure will achieve a satisfactory compensation for the varying
animal weights there are others where the relationship is either unknown
or less simple. It is not safe to assume simple proportionality with a
regression coefficient of unity, as regression coefficients of log dose on
log weight as diverse as 0.573 [5] and 1.511 [4] have been reported.

In an orthodox probit comparative experiment, stratification with
respect to some concomitant variable, such as animal weight, is
possible, but it is not practical to take account of it in the analysis. In an
up-and-down comparative experiment, however, the whole series of
say 80 animals can be broken up on the basis of weight into say 4
groups of 20, and 10 animals from each such group allocated at random
to the standard and 10 to the unknown. This would give us more
homogeneous material and a correspondingly reduced standard error.

Due to a paucity of suitable data in the literature, it is difficult to
predict what magnitude of gain may be expected through this
subdivision of the material. However, as an example we may take Behrens'
data [1] on digitalis in frogs. For February 1928 he measured the fatal
dose of digitalis for 38 frogs varying in weight from 21.8 to 37.5 grams,
and as usual expressed the dose as a fraction of the frogs’ weight.
However, a plot of dose, so defined, against frogs’ weight discloses that the
heavier frogs require a smaller dose. If we rank the frogs in order of
increasing weight and break up the group of 38 into 4 groups of 8, 10,
10, and 10, then the variance of dose in his units within these groups is
0.0035550 compared with an over-all variance of 0.0046134. The ratio
of these variances is 0.771: thus 7.7 animals when used in the smaller
groups will give the same information as 10 animals in the complete
group.

In the case of relatively large experiments, of course, the possibility 
exists of obtaining still greater homogeneity by classifying the animals 
into groups on the basis of more than one parameter. 

A further possibility is the re-use of survivors from previous experi- 
ments. In general, we would expect the tolerance to change, and the 
inclusion of these animals in a comparative probit experiment would 
increase the heterogeneity of the material. However, in an up-and-down 
comparative experiment broken up into small groups, the survivors 
from earlier experiments could form some of the groups. This procedure 
if used with discretion should allow a saving of up to 50 per cent in the 
usage of animals with the up-and-down method. There is also, of 
course, a further reduction compared with probit experiments due to 
the relatively greater efficiency of the up-and-down method. 


VII. CONCLUDING REMARKS 


In choosing between the standard probit method and the up-and-
down method for estimating μ, the following points may be kept in
mind.

(1) The probit method will require substantially more observations.
It shows up best, relatively, when the guess for μ is accurate, but even
here (using a standard 5-level probit design with stimulus interval
equal to one standard deviation) the probit method requires at least
50 per cent more trials than does the up-and-down estimate μ̂ in series
of length 10. The comparison becomes rapidly less favorable to the
probit method if the initial guess is not accurate. When the central
stimulus is off-center by two standard deviations, the probit method
requires approximately 3 times as many observations as the
up-and-down method.

(2) With the up-and-down method applied sequentially, it is possible
to obtain an estimate μ′ with a guaranteed accuracy, regardless of
initial guess for μ (but depending, as does the probit method, on the
guess for σ). This estimate, while less efficient than μ̂ for small y₀ − μ, is
still more efficient than the probit method, which requires at least 40
per cent more trials on the average to give the same accuracy, when the
guess for σ is correct.

(3) The small-sample performance of the up-and-down estimates is
known, whereas that of the probit method is unknown and difficult to
compute. The above comparisons are made on the basis of the
asymptotic variance formula for the probit method, which is probably
optimistic.

(4) The up-and-down estimates are easy to compute arithmetically. 
The probit estimate must be computed either by a troublesome itera- 
tive method, or by a graphical method involving the judgement (and 
hence the bias) of the computer. 

(5) The up-and-down estimates always exist, whereas the probit
estimate may not exist. In small samples with y₀ − μ large, there is a
substantial probability that the probit estimate will not exist.

(6) The probit method has the advantage that all trials may be per- 
formed simultaneously, whereas the up-and-down method requires that 
trials be made sequentially. We may, however, run the up-and-down 
trials in a number of parallel short series, without much loss of accuracy. 
In a typical situation, the up-and-down experiment may take 2 to 5 
times as long to complete as the probit experiment. 

(7) The up-and-down method, when arranged in short series in an 
experiment to compare treatments, can take advantage of any classifi- 
cation of experimental material designed to reduce the variance. This 
may lead to a further substantial increase in efficiency relative to the 
probit method. 

In summary, we believe that the up-and-down method will prove 
superior to the probit method in any situation wherein the arrangement 
of the trials in series is not prohibited by experimental or cost considera- 
tions. Certainly there are many laboratories now using the probit 
method which would profit from a change to the up-and-down design. 


Our thanks are due to Mr. J. McKibben for much of the computation
included in this paper.


DESIGNING SINGLE-SAMPLING INSPECTION PLANS
WHEN THE SAMPLE SIZE IS FIXED


ABRAHAM GOLUB*
Ballistic Research Laboratories, Aberdeen Proving Ground

A simple technique is developed for determining “best”
single sampling inspection plans when the sample size is fixed.
Two cases are considered: (1) single sampling plans for placing
a lot into one of two categories and (2) single sampling plans
for placing a lot into one of three categories. For case (1), the
“best” plans are based upon a criterion of minimizing the sums
of the producer's and consumer's risks. A similar criterion is
employed as a basis for choosing “best” plans for case (2). The
more general result for m categories is given also. Tables 1
through 8 are presented to enable the user to choose the
appropriate sampling plans.


INTRODUCTION AND SUMMARY 


THERE have been various approaches to the problem of designing
single-sampling plans for acceptance sampling by attributes. It is
generally accepted, however, that the primary objectives are twofold:
(1) to accept at least a stated high proportion, say 1 − α, of lots having
true fraction defective not exceeding a given acceptable quality, say
p₁, and (2) to accept not more than a stated small proportion, say β, of
lots having true fraction defective not less than a given objectionable
quality, say p₂. The quantities α and β are commonly known as the
producer’s and consumer’s risks, respectively. The problem of selecting
a single sampling plan is one of determining the parameter n (sample
size) and the parameter c (acceptance number) which will
simultaneously attain both of the aforementioned objectives. Grubbs has solved
this problem and by using his tables it is quite simple to choose the
single sampling plan which gives the desired protection.¹

There are many occasions in practice where for economic,
administrative, or practical reasons n is fixed and small (less than 50). For
these cases, the attainment of the desired protection is usually
impossible. A plan could be devised in which α or β is controlled; however, this
may unnecessarily penalize either the consumer or producer. With n
fixed, therefore, the problem is one of choosing a “best” compromise
plan based upon some criterion. This paper proposes a solution of this
problem.

* The author wishes to express his appreciation to Dr. F. E. Grubbs for reviewing this paper and
for his interest and stimulating comments throughout its preparation.
¹ [...] to consider this problem; his tables, however, give his solu[...].
Frank E. Grubbs, “On designing single sampling inspection plans,”
Annals of Mathematical Statistics, 20 (1949), 242-56.

The criterion which is employed as a basis for a “best” plan is that
the sum of the probabilities of accepting lots of true quality p₁
(acceptable quality) and rejecting lots of true quality p₂ (objectionable
quality) be a maximum. Mathematically, the expression which is to be
maximized may be written as

\[ P = \mathrm{Pr}_{p_1}(A) + \mathrm{Pr}_{p_2}(R) \tag{1} \]

where Pr_{p₁}(A) represents the probability of accepting lots of true
quality = p₁ and Pr_{p₂}(R) represents the probability of rejecting lots of true
quality = p₂. It is noted that in employing this criterion α and β may
vary, whereas in the basic problem α and β are set values. For fixed n,
the value of c which maximizes (1) is the integer nearest to

\[ c = \frac{n}{1 + \dfrac{\log (p_2/p_1)}{\log (q_1/q_2)}} - \frac{1}{2} \tag{2} \]

where q₁ = 1 − p₁ and q₂ = 1 − p₂.
This paper also considers the problem of obtaining single sampling
plans involving two acceptance parameters, $c_1$ and $c_2$, for the purpose
of placing a lot into one of three categories, say 1, 2, or 3. Three quality
levels are set: $p_1$ for category 1, $p_2$ for category 2, and $p_3$ for
category 3. In this case a similar criterion for a "best" plan is employed.
Mathematically the expression which is to be maximized may be written as

$$P = \Pr\nolimits_{p_1}(1) + \Pr\nolimits_{p_2}(2) + \Pr\nolimits_{p_3}(3) \tag{3}$$

where $\Pr_{p_1}(1)$ represents the probability of placing lots of quality
$p_1$ into category 1, $\Pr_{p_2}(2)$ represents the probability of placing
lots of quality $p_2$ into category 2, and $\Pr_{p_3}(3)$ represents the
probability of placing lots of quality $p_3$ into category 3. This criterion
obviously minimizes the sum of the probabilities of misclassification.
The solutions for the $c_1$ and $c_2$ which maximize (3) are the integers
nearest to

$$c_1 = -\frac{1}{2} + \frac{n}{\dfrac{\log (p_2/p_1)}{\log (q_1/q_2)} + 1} \tag{4}$$

$$c_2 = -\frac{1}{2} + \frac{n}{\dfrac{\log (p_3/p_2)}{\log (q_2/q_3)} + 1} \tag{5}$$

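For the general case of classification into $m$ categories, the solutions
for the $c_i$ are

$$c_i = -\frac{1}{2} + \frac{n}{\dfrac{\log (p_{i+1}/p_i)}{\log (q_i/q_{i+1})} + 1}, \qquad i = 1, 2, \ldots, m-1. \tag{6}$$
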
The tables at the end of this paper are sampling tables which enable
the user to select the proper acceptance numbers for single sampling,
based upon the solution developed herein.² The sample size, $n$, ranges
from five through forty in intervals of five. The lot qualities range from
.01 to .20 for acceptable qualities and up to .40 for objectionable
qualities. Lot qualities are tabulated in intervals of .01.

Examination of formulas (2), (4) and (5) indicates that these same
tables can be used also to choose $c_1$ and $c_2$ in obtaining solutions for
classification into three categories. This is accomplished by repeated
entry into the tables, first with $p_1$ and $p_2$ to obtain $c_1$ and then
with $p_2$ and $p_3$ to obtain $c_2$. In a similar manner, the $c_i$ for the
general case of $m$ categories can be obtained from the tables.

The solutions developed in this paper are based upon the assumption
of a Binomial Population.

In summary, tables are provided which enable the user to determine
quickly the acceptance numbers, $c$, for fixed $n$, so that the sum of the
probabilities of misgrading or misclassifying is a minimum. For those
cases where $n$ exceeds the range of the tables, a simple computation
employing formula (2) yields the desired results.


² These tables originally appeared in this author's "Acceptance Numbers for Placing a Lot from
Which a Single Sample is Drawn into One of Three Grades," Memorandum Report No. 581, Ballistic
Research Laboratories, Aberdeen Proving Grounds, Md. (Nov. 1951).
EXAMPLES

I. Given:
n = 40
$p_1$ = .01
$p_2$ = .08.

Find $c$ so that $\alpha + \beta$ is a minimum.

The last table yields $c = 1$ when entered with $p_1 = .01$ and $p_2 = .08$.
Thus the single sampling plan which minimizes $\alpha + \beta$ is $n = 40$; $c = 1$.
Specifically, $\alpha = .06$; $\beta = .16$ and $\alpha + \beta = .22$.


II. Given:
n = 25
$p_1$ = .01
$p_2$ = .10
$p_3$ = .30.

Find $c_1$ and $c_2$ so that $\Pr_{p_1}(1) + \Pr_{p_2}(2) + \Pr_{p_3}(3)$ is a maximum.

The fifth table yields $c_1 = 0$ when entered with $p_1 = .01$ and $p_2 = .10$
and also $c_2 = 4$ when entered with $p_2 = .10$ and $p_3 = .30$. Specifically,
$\Pr_{p_1}(1) = .778$; $\Pr_{p_2}(2) = .830$; and $\Pr_{p_3}(3) = .910$ and their sum is 2.518.
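
As a rough numerical check on Example I, the acceptance number can be computed directly from formula (2) rather than read from the tables. The sketch below is illustrative only and is not part of the paper: the function names `binom_cdf` and `acceptance_number` are hypothetical, and a binomial population is assumed, as in the text.

```python
from math import comb, log

def binom_cdf(c, n, p):
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(c + 1))

def acceptance_number(n, p1, p2):
    """Integer nearest to formula (2); assumes 0 < p1 < p2 < 1."""
    q1, q2 = 1 - p1, 1 - p2
    value = -0.5 + n / (log(p2 / p1) / log(q1 / q2) + 1)
    return round(value)

# Example I: n = 40, p1 = .01, p2 = .08
n, p1, p2 = 40, 0.01, 0.08
c = acceptance_number(n, p1, p2)
alpha = 1 - binom_cdf(c, n, p1)   # probability of rejecting a lot of quality p1
beta = binom_cdf(c, n, p2)        # probability of accepting a lot of quality p2
print(c, round(alpha, 2), round(beta, 2), round(alpha + beta, 2))
```

Under these assumptions the computed values agree with those quoted in Example I to two decimals.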


THEORETICAL BASIS FOR CONSTRUCTION OF TABLES 1 THROUGH 8


Assuming an acceptable quality level of $p_1$ and an objectionable
quality level of $p_2$ ($p_1 < p_2$) and the sample size $n$ fixed, we can write

$$\Pr\nolimits_{p_1}(A) = \sum_{i=0}^{c} \frac{n!}{i!\,(n-i)!}\, p_1^{i} q_1^{n-i} \tag{7}$$

$$\Pr\nolimits_{p_2}(R) = 1 - \sum_{i=0}^{c} \frac{n!}{i!\,(n-i)!}\, p_2^{i} q_2^{n-i} \tag{8}$$

where, assuming a binomial population, $\Pr_{p_1}(A)$ represents the
probability of accepting lots of true quality $p_1$ and $\Pr_{p_2}(R)$
represents the probability of rejecting lots of true quality $p_2$. The
criterion for a "best" plan, as outlined in the introduction, is that

$$P = \Pr\nolimits_{p_1}(A) + \Pr\nolimits_{p_2}(R) = \text{maximum}. \tag{9}$$

For $c$ to be the acceptance number that maximizes (9), provided that


282 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1953


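(9) has but one maximum, the following functional relationships must
hold:

$$P_c - P_{c-1} > 0 \tag{10}$$

$$P_{c+1} - P_c < 0 \tag{11}$$

where the subscript denotes the acceptance number. Expanding (10)
and (11) yields

$$\frac{n!}{c!\,(n-c)!}\left(p_1^{c} q_1^{n-c} - p_2^{c} q_2^{n-c}\right) > 0 \tag{12}$$

$$\frac{n!}{(c+1)!\,(n-c-1)!}\left(p_1^{c+1} q_1^{n-c-1} - p_2^{c+1} q_2^{n-c-1}\right) < 0. \tag{13}$$

Dividing both sides of the inequality (12) by the positive quantity
$n!/[c!\,(n-c)!]$ gives

$$p_1^{c} q_1^{n-c} - p_2^{c} q_2^{n-c} > 0 \tag{14}$$

which is equivalent to

$$\left(\frac{p_1 q_2}{p_2 q_1}\right)^{c} > \left(\frac{q_2}{q_1}\right)^{n}. \tag{15}$$

Since $p_1 q_2/(p_2 q_1)$ is less than unity, taking logarithms of both sides
of (15) shows that $c$ is an integer satisfying

$$c < \frac{n}{\dfrac{\log (p_2/p_1)}{\log (q_1/q_2)} + 1}. \tag{16}$$

Similarly, (13) shows that

$$c > \frac{n}{\dfrac{\log (p_2/p_1)}{\log (q_1/q_2)} + 1} - 1, \tag{17}$$

so that

$$\frac{n}{\dfrac{\log (p_2/p_1)}{\log (q_1/q_2)} + 1} - 1 \;<\; c \;<\; \frac{n}{\dfrac{\log (p_2/p_1)}{\log (q_1/q_2)} + 1}. \tag{18}$$

From (18) it is obvious that $c$ may be obtained by evaluating

$$-\frac{1}{2} + \frac{n}{\dfrac{\log (p_2/p_1)}{\log (q_1/q_2)} + 1} \tag{19}$$

and rounding to the nearest integer. If the value of (19) falls exactly
midway between two integers then equal maxima exist at those two
integers.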
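
The rounding rule can be checked by brute force: evaluate the criterion $P$ for every admissible acceptance number and compare the maximizing value with the rounded value of (19). The following sketch is illustrative only; the helper names and the test cases other than those of the Examples are assumptions chosen for the check, and none of them falls midway between two integers.

```python
from math import comb, log

def criterion(c, n, p1, p2):
    """P = Pr_{p1}(accept) + Pr_{p2}(reject) for acceptance number c."""
    def cdf(p):
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(c + 1))
    return cdf(p1) + 1 - cdf(p2)

def formula_19(n, p1, p2):
    """Unrounded value of expression (19)."""
    q1, q2 = 1 - p1, 1 - p2
    return -0.5 + n / (log(p2 / p1) / log(q1 / q2) + 1)

def brute_force_c(n, p1, p2):
    """Acceptance number in 0..n with the largest criterion value."""
    return max(range(n + 1), key=lambda c: criterion(c, n, p1, p2))

# (n, p1, p2): the first two come from the Examples, the others are hypothetical.
cases = [(40, 0.01, 0.08), (25, 0.10, 0.30), (10, 0.05, 0.20), (5, 0.02, 0.40)]
results = [(round(formula_19(n, p1, p2)), brute_force_c(n, p1, p2))
           for n, p1, p2 in cases]
for case, (by_formula, by_search) in zip(cases, results):
    print(case, by_formula, by_search)
```

In each case the two acceptance numbers coincide, as the derivation of (18) requires.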

An alternate and interesting approach for obtaining the acceptance
number $c$ which maximizes (9) is to approximate (9) as follows:

$$
\begin{aligned}
P \approx{} & \frac{n!}{0!\,n!}\, p_1^{0} q_1^{n} + \frac{n!}{1!\,(n-1)!}\, p_1 q_1^{n-1} + \cdots
  + \int_{j-1/2}^{c+1/2} \frac{n!}{x!\,(n-x)!}\, p_1^{x} q_1^{n-x}\, dx \\
 & + 1 - \left[ \frac{n!}{0!\,n!}\, p_2^{0} q_2^{n} + \frac{n!}{1!\,(n-1)!}\, p_2 q_2^{n-1} + \cdots
  + \int_{j-1/2}^{c+1/2} \frac{n!}{x!\,(n-x)!}\, p_2^{x} q_2^{n-x}\, dx \right]
\end{aligned}
\tag{20}
$$

where $j$ may be set arbitrarily equal to 0, or 1, or 2, ..., but at some
value less than $c$. The range of integration for a particular term, say the
$(k+1)$st, goes from $k - \tfrac12$ to $k + \tfrac12$, thus employing a useful
device for passing from a discrete variable to a continuous variable.
Differentiating (20) with respect to $c$ and equating the result to zero (the
condition for a maximum) gives

$$\frac{n!}{(c+\tfrac12)!\,(n-c-\tfrac12)!}\, p_1^{c+1/2} q_1^{n-c-1/2}
 = \frac{n!}{(c+\tfrac12)!\,(n-c-\tfrac12)!}\, p_2^{c+1/2} q_2^{n-c-1/2}. \tag{21}$$

Taking logarithms of both expressions in (21) and solving for $c$ yields

$$c = -\frac{1}{2} + \frac{n}{\dfrac{\log (p_2/p_1)}{\log (q_1/q_2)} + 1}, \tag{22}$$

the same expression as (19), which had been derived exactly. Thus the
value of $c$ which maximizes the approximation (20), when rounded, is
equal to the $c$ value which maximizes (9). This justifies the use of the
approximation and provides an efficient means of dealing with certain
optimum problems involving the binomial distribution.
The tables in this paper are based upon the solution developed herein.


SINGLE SAMPLING PLANS FOR PLACING A LOT INTO ONE OF
THREE CATEGORIES
Given quality levels $p_1$, $p_2$, and $p_3$ ($p_1 < p_2 < p_3$) for category 1,
category 2, and category 3, respectively, with fixed sample size $n$, we can
write

$$\Pr\nolimits_{p_1}(1) = \sum_{i=0}^{c_1} \frac{n!}{i!\,(n-i)!}\, p_1^{i} q_1^{n-i} \tag{23}$$

$$\Pr\nolimits_{p_2}(2) = \sum_{i=0}^{c_2} \frac{n!}{i!\,(n-i)!}\, p_2^{i} q_2^{n-i}
 - \sum_{i=0}^{c_1} \frac{n!}{i!\,(n-i)!}\, p_2^{i} q_2^{n-i}$$

$$\Pr\nolimits_{p_3}(3) = 1 - \sum_{i=0}^{c_2} \frac{n!}{i!\,(n-i)!}\, p_3^{i} q_3^{n-i} \tag{24}$$

where $\Pr_{p_1}(1)$ represents the probability of placing lots of quality $p_1$
into category 1, $\Pr_{p_2}(2)$ represents the probability of placing lots into
category 2, and $\Pr_{p_3}(3)$ represents the probability of placing lots of
quality $p_3$ into category 3.

Using the criterion discussed in the introductory section, it is desired
to maximize the following expression:

$$P = \Pr\nolimits_{p_1}(1) + \Pr\nolimits_{p_2}(2) + \Pr\nolimits_{p_3}(3). \tag{25}$$

It is evident that, for any fixed value of $c_2$, the solution for $c_1$ which
maximizes (25) is equivalent to the one developed in the preceding
section. Similarly, for any fixed value of $c_1$ the solution for $c_2$ is
equivalent also to the solution developed in the preceding section. Thus the
values of $c_1$ and $c_2$ which maximize (25) are simply

$$c_1 = -\frac{1}{2} + \frac{n}{\dfrac{\log (p_2/p_1)}{\log (q_1/q_2)} + 1}
\qquad \text{and} \qquad
c_2 = -\frac{1}{2} + \frac{n}{\dfrac{\log (p_3/p_2)}{\log (q_2/q_3)} + 1}$$

where $c_1$ and $c_2$ are rounded to the nearest integer.

Following this line of reasoning, it is obvious that the preceding
results can be applied to the general case of $m$ categories. That is, if
$p_1, p_2, \ldots, p_m$ are the respective lot qualities of categories
1, 2, 3, ..., $m$, then the $c_i$ ($i = 1, 2, 3, \ldots, m-1$) which maximize

$$P = \sum_{i=1}^{m} \Pr\nolimits_{p_i}(i) \tag{26}$$

are given simply as

$$c_i = -\frac{1}{2} + \frac{n}{\dfrac{\log (p_{i+1}/p_i)}{\log (q_i/q_{i+1})} + 1}. \tag{27}$$

It is obvious from (27) that the tables can be used to obtain any $c_i$ by
entering them with $p_i$ and $p_{i+1}$.
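
The general rule lends itself to a short computation. The sketch below is an illustration, not the author's procedure: it evaluates the expression for $c_i$ for each adjacent pair of lot qualities and then computes the probabilities of correct classification, assuming that a lot is placed in category $i$ when the number of defectives $x$ satisfies $c_{i-1} < x \le c_i$. The function names are hypothetical; the data are those of Example II.

```python
from math import comb, log

def cut_points(n, qualities):
    """Acceptance numbers c_1..c_{m-1}; qualities must be strictly increasing."""
    cuts = []
    for p, p_next in zip(qualities, qualities[1:]):
        q, q_next = 1 - p, 1 - p_next
        cuts.append(round(-0.5 + n / (log(p_next / p) / log(q / q_next) + 1)))
    return cuts

def correct_classification(n, qualities, cuts):
    """Pr_{p_i}(i) for each category, with category i chosen when c_{i-1} < x <= c_i."""
    lower = [-1] + cuts          # exclusive lower limits
    upper = cuts + [n]           # inclusive upper limits
    def pmf(x, p):
        return comb(n, x) * p**x * (1 - p)**(n - x)
    return [sum(pmf(x, p) for x in range(lo + 1, hi + 1))
            for p, lo, hi in zip(qualities, lower, upper)]

# Example II: n = 25 with lot qualities .01, .10, .30
qualities = [0.01, 0.10, 0.30]
cuts = cut_points(25, qualities)
probs = correct_classification(25, qualities, cuts)
print(cuts, [round(v, 3) for v in probs], round(sum(probs), 3))
```

With these inputs the cut points and the three probabilities agree with the figures quoted in Example II to three decimals.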
ON ERRORS IN MATRIX INVERSION


PAUL S. DWYER
University of Michigan
AND
FREDERICK V. WAUGH
Bureau of Agricultural Economics


I. INTRODUCTION


MATHEMATICAL statisticians have given a great deal of attention to
sampling errors, much less to errors of computation, and very
little to errors due to faulty data. Before the statistician has confidence
in his results, he should explore all three kinds of errors.

This paper does not consider errors of sampling, since much has been 
written on that subject. A section is devoted to the discovery of the 
amount of error resulting from the use of approximate computational 
methods, but the paper is devoted primarily to a consideration of the 
effects of errors in the original data—and specifically to the size of the 
maximum error in a particular element of an inverse matrix due to the 
inherent errors in the elements of the original matrix. 

The computation of an inverse matrix is the key to such important
statistical problems as multiple regression, factor analysis, and the
Leontief technique of "inter-industry analysis." The methods used in
this paper might be applicable to related problems, such as finding
bounds to the errors in solving simultaneous equations, but it seemed
best to limit this paper to errors in elements of the inverse matrix.

Perhaps a few words should be said here about the nature of inherent
errors (or inaccuracies) in the original data. Seldom can a statistician
get absolutely accurate data in any field of work. Even in physical
experiments, the data are at least subject to errors of measurement. In
economics, the statistician must work with estimates of such variables
as crop yields, price indices, and national income. In all such cases, it
is extremely important that the statistician have information concerning
the reliability of the data he uses. Unfortunately few statistical
agencies of the Government, or of the trade, provide such information.
This situation should be rectified to whatever extent is possible. In any
case, a statistician should always find out as much as he can about the
data he uses, and should form at least a rough judgment of the maximum
errors associated with each series. Without such judgment, he
cannot be sure that his analysis is valid.

A recent book [5] and paper [4] by one of the authors give considerable
attention to problems related to the subject of the present paper.
A formula for the first order discrepancies in the elements of the inverse
matrix due to inherent errors of the data is presented [5, p. 286]. The
present paper extends these results

(a) by presenting methods for establishing a bound for the total
discrepancy,

(b) by showing how the actual extreme discrepancy for each element
of the inverse may be determined and calculated.

The results are applicable to the problem in which the elements of
the matrix to be inverted are subject to unspecified but limited errors.
Thus each element of the matrix is composed of a known part, $a_{ij}$, and
an unknown part, $e_{ij}$, with the specification that $|e_{ij}| \le \eta_{ij}$.
The $\eta_{ij}$ are the bounds for the errors of the elements.

There are several problems which are closely related to the problem
of the errors of the inverse matrix. These are

(a) the problem of the bounds and extreme values of the determinant
of the matrix $A$,

(b) the problem of the bounds and extreme values of the roots of
equations which feature $A^{-1}$,

(c) the problem of adjusted equations in which the values of the $e_{ij}$
are known.

The amount of material we found which seemed necessary to an
adequate discussion of the topic of errors of matrix inversion was so
extensive that we decided to treat this specific problem in this paper.
The other problems suggested above, though closely related to the problem
of the errors of matrix inversion, are not treated here except as
they contribute to the subject of this paper, though several references
to important papers dealing with these other problems are given.


II. AN EXAMPLE TO BE ANALYZED


An illustration is next presented in order to clarify the nature of our 
problem. For this purpose, we have chosen a problem of economics and 
statistics which was recently discussed by one of the authors [26]. The 
problem was to determine the least expensive combination of feeds for 
dairy cows which would meet, or surpass, certain stated requirements. 
This is a practical application of linear programming. The study con- 
cluded that a combination of milo, middlings, gluten, and bran was the 
least expensive dairy ration meeting the requirements for total digestible
nutrients, digestible protein, calcium, phosphorus, and total
weight. The analysis of this problem involves the computation of an
inverse matrix. We wish to determine the maximum possible error in
the elements of this inverse matrix.
The essential data are given in Table I.


TABLE I

                                                                  Absolute
                                                                  row-sums
The original matrix,         0.495    0.275    0.098    0.459
A                            0.436    0.449    0.264    0.410
                             0.395    0.557    1.319    0.385
                             0.423    0.436    0.467    0.467

Maximum inherent errors,     0.0010   0.0005   0.0005   0.0010     0.0030
$\eta_{ij}$                  0.0010   0.0010   0.0005   0.0010     0.0035
                             0.0010   0.0010   0.0015   0.0010     0.0045
                             0.0010   0.0010   0.0010   0.0010     0.0040

Computed inverse,            6.483    5.153    3.311  -13.626     28.573
C                           -4.798    6.485   -0.841   -0.285     12.409
                             0.694   -1.627    1.084   -0.147      3.552
                            -2.086   -9.095   -3.299   14.896     29.376

Absolute column-sums        14.061   22.360    8.535   28.954     73.910


The first section of Table I exhibits the original matrix. The elements,
$a_{ij}$, represent the proportion of the $j$th requirement which can be
supplied by one dollar's worth of the $i$th feed. These elements were
computed using average wholesale prices in Kansas City from October
1949 through September 1950, and using commonly accepted requirements.

The second section exhibits an estimate of the maximum absolute
error, $\eta_{ij}$, associated with each element of $A$. These estimates are not
exact, and warrant further study, but will serve the purposes of this
paper. Obviously, an error of at least 0.0005 is possible for each element,
since the data were rounded to three decimals. Somewhat larger
errors are possible because of variation in the nutritive content of the
feeds. We assume no error in prices. The problem might be stated thus:
If these feeds could be bought at quoted prices in the 1949-50 feeding
year, and if we assume feeds of approximately the average quality, how
much of each of the four feeds would be required? The assumed $\eta_{ij}$
appear reasonable from this standpoint. We also consider the case in
which $\eta_{ij} = \eta = 0.0005$. This would be an appropriate assumption if it
were assumed that feeds of exactly average quality could be bought at
the precise quoted price.

Finally, the last section of Table I shows the inverse of $A$, which we
designate $C$. It was calculated by a method which first obtains the
adjoint of $A$ [27] [5, p. 217]. In this way, we can guarantee that the
computed elements of $C$ are correct to the three decimal places
recorded. This value of the inverse, the elements of which are correct to
three decimal places, can also be computed by the method of Section 13.

To the right of Table I are shown the sums of absolute values of
elements of rows. Below are shown the sums of absolute values of
elements of columns. These sums are used in our analysis.

We can guarantee that $C$ is computed to three decimals of accuracy.
But we have not considered the possible effects of the inherent errors,
$\eta_{ij}$. The main purpose of this paper is to determine bounds, or extreme
values, to the discrepancies between elements of $C$, and elements of the
inverse of a "true" matrix which is free from inherent errors.


III. GENERAL FORMULAS FOR THE INHERENT ERRORS OF THE INVERSE


We first consider the accumulation of the inherent errors. Let $T$ be a
real matrix which is composed of an approximate part $A$ and an error
part $E$. Then

$$T = A + E. \tag{3.1}$$

The matrix $E$ is usually composed of unknown elements, $e_{ij}$, which may
be positive, negative, or zero, but which are not greater in absolute
value than some bound $\eta_{ij}$. We desire to find the inverse of $T$ knowing
the value of $A$ and the bounds for $E$, or alternatively, we desire to
know how much the inverse of $T$ might differ from the inverse of $A$.
It is necessary of course that both $A$ and $T$ be non-singular.

In most applied problems the bounds for $E$ are small when compared
with the corresponding elements of $A$, i.e., the relative errors of the
elements of $A$ are small. A major concern of this paper is with this
situation though some of the formulas are appropriate to the more
general case.

In the first part of the paper we assume that $A^{-1}$ can be obtained by
exact methods, or, at least, that a correct $k$ decimal place approximation
can be computed. We get immediately from (3.1), writing $C = A^{-1}$,

$$T^{-1} = (A + E)^{-1} = [(I + EC)A]^{-1} = C(I + EC)^{-1}. \tag{3.2}$$

If the elements of $E$ are small enough so that any norm¹ of $EC$ is less
than 1, it can be shown [25] that $T^{-1}$ can be expressed in terms of the
convergent series [7, p. 40],

$$T^{-1} = C\,[I - EC + (EC)^2 - (EC)^3 + \cdots]. \tag{3.3}$$

Pre-multiplication by $A$ shows that

$$I - EC + (EC)^2 - (EC)^3 + \cdots = AT^{-1} = (I + EC)^{-1}. \tag{3.3a}$$


¹ See Section IV for a discussion of norms.
The accumulation of the inherent errors in the inverse is given by
the difference between $T^{-1}$ and $A^{-1} = C$. In this paper we call this the
discrepancy matrix, $D$. Then

$$D = T^{-1} - C = -CEC\,[I - EC + (EC)^2 - \cdots] \tag{3.4}$$

$$D = T^{-1} - C = -C\,[EC - (EC)^2 + (EC)^3 - \cdots]. \tag{3.4a}$$

Equations (3.4) and (3.4a) are basic to all the sections of this paper
dealing with inherent errors.
From (3.1) with $C = A^{-1}$ one has

$$D = T^{-1} - A^{-1} = T^{-1}(A - T)A^{-1} = -T^{-1}EC = -(D + C)EC \tag{3.5}$$

so, solving for $D$,

$$D = -CEC\,(I + EC)^{-1}. \tag{3.6}$$

Alternative forms of the above formulas are

$$T^{-1} = (A + E)^{-1} = [A(I + CE)]^{-1} = (I + CE)^{-1}C \tag{3.2$'$}$$

$$T^{-1} = [I - CE + (CE)^2 - (CE)^3 + \cdots]\,C \tag{3.3$'$}$$

$$D = T^{-1} - C = -[I - CE + (CE)^2 - \cdots]\,CEC \tag{3.4$'$}$$

$$D = T^{-1} - A^{-1} = -CET^{-1} = -CE(C + D) \tag{3.5$'$}$$

$$D = T^{-1} - C = -(I + CE)^{-1}CEC. \tag{3.6$'$}$$

Neglecting terms of (3.4) of order higher than one in $E$ we get:

$$D_1 = -CEC, \tag{3.7}$$

where $D_1$ is the matrix of first-order discrepancies.
This formula could be written somewhat symbolically as

$$d(A^{-1}) = -A^{-1}(dA)A^{-1} \tag{3.8}$$

which is a matrix generalization of the formula for $d(x^{-1})$. Other
derivations of this formula (or of the corresponding derivative formula) are
available [5, p. 285], [14], [6].
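
The relation between the exact discrepancy and its first-order approximation can be illustrated numerically. The sketch below uses a hypothetical 2 by 2 matrix $A$ and a hypothetical error matrix $E$ (neither taken from the paper), with small pure-Python helpers standing in for a matrix library. It compares $T^{-1} - C$ computed directly with the closed form (3.6) and with the first-order term (3.7).

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(M):
    """Inverse of a 2 x 2 matrix by the adjoint formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def add(X, Y, sign=1.0):
    """X + sign * Y, element by element."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def neg(X):
    return [[-x for x in row] for row in X]

A = [[2.0, 1.0], [1.0, 3.0]]            # hypothetical approximate matrix
E = [[0.01, -0.02], [0.005, 0.01]]      # hypothetical error matrix
I2 = [[1.0, 0.0], [0.0, 1.0]]

C = inv2(A)
EC = matmul(E, C)
D_direct = add(inv2(add(A, E)), C, sign=-1.0)            # T^{-1} - C
D_closed = neg(matmul(matmul(C, EC), inv2(add(I2, EC)))) # -CEC(I + EC)^{-1}, as in (3.6)
D_first = neg(matmul(C, EC))                             # -CEC, as in (3.7)

for name, M in (("direct", D_direct), ("closed", D_closed), ("first-order", D_first)):
    print(name, [[round(v, 6) for v in row] for row in M])
```

The direct and closed-form results agree to rounding error, while the first-order term differs from them only by quantities of the second order in $E$.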




IV. THE NORM OF THE DISCREPANCY MATRIX


We wish to establish an upper bound to the elements of $D$. One
method is to use some norm of $D$. For our purposes we define $N(A)$, the
norm of the real matrix $A$, as a quantity which satisfies the conditions

(4.1)

(a) $N(A) \ge 0$

(b) $N(A) = 0$ if and only if $A = 0$

(c) $N(cA) = |c|\,N(A)$ where $c$ is a scalar

(d) $N(A + B) \le N(A) + N(B)$

(e) $N(AB) \le N(A)N(B)$

(f) $N(A) = 1$ if $A$ is a fundamental unit matrix (with all
elements zero except one element which is unity).

These conditions are in substantial agreement with those advocated
by Bowker [2]. Application of these conditions to (3.4a) gives, with
$N(EC) < 1$,

$$N(D) \le \frac{N(C)\,N(EC)}{1 - N(EC)} \tag{4.2}$$

and if $N(E)N(C) < 1$,

$$N(D) \le \frac{N(E)\,[N(C)]^2}{1 - N(E)N(C)}. \tag{4.3}$$

Several norms are available. In this section we feature norms
advocated recently by Hotelling [11], Bowker [2], and the maximum
coefficient, related to a norm, advocated by Turing [24]. These are

$$N_h(A) = \Bigl(\sum_i \sum_j a_{ij}^2\Bigr)^{1/2} = (\operatorname{trace} A'A)^{1/2}, \tag{4.4}$$

$$N_b(A) = \max_i \sum_j |a_{ij}|, \tag{4.5}$$

$$M(A) = \max_{i,j} |a_{ij}|. \tag{4.6}$$

It is to be noted that $nM(A)$ is a norm. The $M(A)$ used by Lonseth [13]
appears to be a norm as does the $\max_i [\lambda_i(A'A)]^{1/2}$, where
$\lambda_i$ is the characteristic root of $A'A$, of Wittmeyer [29]. However the
$M(A)$ and $m(A)$ of Lonseth and the $\max_i [\lambda_i(A'A)]^{1/2}$ and
$\min_i [\lambda_i(A'A)]^{1/2}$ of Wittmeyer play the more precise role of bounds.

We calculate norms for the discrepancy matrix of Table I. We compute
$N(E)$ from the $\eta_{ij}$ matrix and $N(C)$ from the elements of the
inverse matrix. We have

(4.7) $N_h(E) \le 0.003873$, $N_h(C) = 25.60$ so $N_h(D) \le 2.82$.
(4.8) $N_b(E) \le 0.0045$, $N_b(C) = 29.376$ so $N_b(D) \le 4.47$.

If the Bowker norm is applied columnwise rather than row-wise, we
get

(4.9) $N_b'(E) \le 0.0040$, $N_b'(C) = 28.954$, so $N_b'(D) \le 3.79$.

It can also be shown that when $n^2 M(E)M(C) < 1$

$$M(D) \le \frac{n^2 M(E)\,[M(C)]^2}{1 - n^2 M(E)M(C)}. \tag{4.10}$$

Then

(4.11) $M(E) \le 0.0015$, $M(C) = 14.896$, so $M(D) \le 8.29$.

The use of $nM(A)$ as a norm gives

(4.12) $N_M(E) \le 0.0060$, $N_M(C) = 59.584$, so $N_M(D) \le 33.154$

and no element of $D$ is larger than $33.154/4 = 8.29$ as is shown in (4.11).

From the calculations above we could use any of 2.82, 4.47, 3.79, or
8.29 as an upper bound to the discrepancy elements. We are free to
choose the lowest available bound. Naturally we prefer a bound of 2.82
to one of 8.29 but even the bound 2.82 is too high for much confidence
in the precision of the elements of the inverse matrix of Table I. It is
not surprising though that the norm which requires the most detailed
calculation is the one which gives us the best results.
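
The four bounds can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical. The entries of $C$ are taken as those for which the absolute row and column sums reproduce the margins of Table I, and the error bounds $\eta_{ij}$ are used in place of the unknown $E$, which is legitimate for these norms because $|e_{ij}| \le \eta_{ij}$.

```python
from math import sqrt

# Computed inverse C and error bounds eta; C is taken so that its absolute
# row and column sums match the margins of Table I.
C = [[6.483, 5.153, 3.311, -13.626],
     [-4.798, 6.485, -0.841, -0.285],
     [0.694, -1.627, 1.084, -0.147],
     [-2.086, -9.095, -3.299, 14.896]]
ETA = [[0.0010, 0.0005, 0.0005, 0.0010],
       [0.0010, 0.0010, 0.0005, 0.0010],
       [0.0010, 0.0010, 0.0015, 0.0010],
       [0.0010, 0.0010, 0.0010, 0.0010]]

def root_sum_squares(M):                      # norm (4.4)
    return sqrt(sum(v * v for row in M for v in row))

def max_row_sum(M):                           # norm (4.5)
    return max(sum(abs(v) for v in row) for row in M)

def max_col_sum(M):                           # (4.5) applied columnwise
    return max(sum(abs(row[j]) for row in M) for j in range(len(M[0])))

def n_times_max(M):                           # n M(A), with M(A) as in (4.6)
    return len(M) * max(abs(v) for row in M for v in row)

def discrepancy_bound(norm):
    """Right-hand side of (4.3) for the given norm."""
    ne, nc = norm(ETA), norm(C)
    if ne * nc >= 1:
        raise ValueError("bound requires N(E) N(C) < 1")
    return ne * nc**2 / (1 - ne * nc)

bounds = {
    "root sum of squares": discrepancy_bound(root_sum_squares),
    "row sums": discrepancy_bound(max_row_sum),
    "column sums": discrepancy_bound(max_col_sum),
    "maximum element": discrepancy_bound(n_times_max) / len(C),
}
for name, value in bounds.items():
    print(name, round(value, 2))
```

Dividing the $nM$ bound by $n$ in the last entry gives the bound on the largest element of $D$, as in the remark following (4.12).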

These norms are all relatively easy to compute and in some cases,
particularly when all the terms of the matrix have approximately equal
absolute values, may be satisfactory. In general, however, they provide
rather loose bounds to the discrepancies, $d_{ij}$. Commonly, as in the
present case, we need a closer bound for each $d_{ij}$. We proceed therefore
to consider the possibility of establishing closer bounds, or perhaps
lower extreme values, than those found in this section of the paper.


V. BOUNDS FOR THE ELEMENTS OF THE DISCREPANCY MATRIX
WHEN THE ERRORS HAVE A COMMON BOUND

An important special case is that in which the maximum absolute
error in any element of $A$ is $\eta$. For example, if the elements of $A$ are
accurate to three decimal places, the value of the error of any element
is between $-0.0005$ and $0.0005$, and we say $\eta = 0.0005$.

A method is available [5, p. 285] for computing the first order
approximations to the elements of the discrepancy matrix. This is done
by assigning the elements of $E$ in (3.7) as either $-\eta$ or $\eta$ in such a
manner as to make a particular element of $D_1$ take on the extreme value.
A different set of values of $E$ may be necessary to obtain the extreme
value for each term in the first order discrepancy matrix. The results
are given by the formula

$$B[d_{1,rs}] = \eta \sum_i |c_{ri}| \sum_j |c_{js}|. \tag{5.1}$$

The calculations are illustrated in Table II where the $A$ and $C$
matrices are those of Table I and $\eta = 0.0005$. The sums of the absolute
values of the rows and columns of $c_{ij}$ are placed in the margins
of the first matrix of Table II. These marginal values are multiplied to
obtain the elements of the matrix. These products are multiplied by
$\eta = 0.0005$ to obtain the values $B[d_{1,rs}]$ which are recorded in the second
matrix.


TABLE II

                                                              Row sum
Products          401.765   638.892   243.871   827.303       28.573
                  174.483   277.465   105.911   359.290       12.409
                   49.945    79.423    30.316   102.845        3.552
                  413.056   656.847   250.724   850.553       29.376

Column sum         14.061    22.360     8.535    28.954       73.910

$B[d_{1,rs}]$       0.201     0.319     0.122     0.414
                    0.087     0.139     0.053     0.180
                    0.025     0.040     0.015     0.051
                    0.207     0.328     0.125     0.425

$B[d_{rs}]$         0.209     0.331     0.127     0.430
                    0.090     0.144     0.055     0.187
                    0.026     0.042     0.016     0.053
                    0.215     0.341     0.130     0.441

How good are these first order discrepancies? One of the purposes of
this paper is to show that the results of the method described above
can be extended, with little more computational work, to provide
bounds for the actual discrepancy matrix rather than bounds for the
first order approximation to it. This can be done if $N(EC) < 1$ and if
$\eta \sum_i \sum_j |c_{ij}| < 1$.

We show below that a bound for $d_{rs}$ is given by the formula

$$B[d_{rs}] = \frac{\eta \sum_i |c_{ri}| \sum_j |c_{js}|}{1 - \eta \sum_i \sum_j |c_{ij}|}. \tag{5.2}$$

This formula was given essentially, though not explicitly, by Lonseth
[13, p. 197] who directed his attention primarily to bounds for the errors
of solutions of equations rather than to bounds for the errors of the
inverse matrix. The general nature of Lonseth's proofs should be of
interest to the more mathematically inclined readers.

This formula indicates that we need only multiply the first order
discrepancies by $1/(1 - \eta \sum_i \sum_j |c_{ij}|)$ to get discrepancy bounds.
Alternatively these bounds may be obtained by multiplying the products in
Table II by $\eta/(1 - \eta \sum_i \sum_j |c_{ij}|)$. Thus the values in the last
matrix of Table II are obtained by dividing the values of $B[d_{1,rs}]$ by 0.963045.
Slightly more accurate values, since they avoid the rounding off errors
of the $B[d_{1,rs}]$ terms, may be obtained by multiplying the elements of
the first matrix of Table II by 0.000519187.
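
The two sets of bounds in Table II follow from the marginal sums alone, so the computation is easily mechanized. The sketch below is illustrative and its function names are hypothetical; the row and column sums are the margins of Table II, and the final line applies the same formula with a common bound of 0.0015, the largest of the error bounds assumed in Table I.

```python
ROW_SUMS = [28.573, 12.409, 3.552, 29.376]   # sum over i of |c_ri|, margins of Table II
COL_SUMS = [14.061, 22.360, 8.535, 28.954]   # sum over j of |c_js|, margins of Table II

def first_order_bounds(eta, row_sums, col_sums):
    """First-order bounds: eta times the product of the marginal sums."""
    return [[eta * r * s for s in col_sums] for r in row_sums]

def full_bounds(eta, row_sums, col_sums):
    """Bounds for the actual discrepancies; requires eta * (sum of all |c_ij|) < 1."""
    k = 1 - eta * sum(row_sums)
    if k <= 0:
        raise ValueError("bound requires eta * sum |c_ij| < 1")
    return [[v / k for v in row] for row in first_order_bounds(eta, row_sums, col_sums)]

first = first_order_bounds(0.0005, ROW_SUMS, COL_SUMS)
full = full_bounds(0.0005, ROW_SUMS, COL_SUMS)
factor = 0.0005 / (1 - 0.0005 * sum(ROW_SUMS))   # multiplier applied to the products
worst_case = full_bounds(0.0015, ROW_SUMS, COL_SUMS)[3][3]

for row in full:
    print([round(v, 3) for v in row])
print(round(factor, 9), round(worst_case, 3))
```

The printed factor corresponds to the multiplier 0.000519187 quoted in the text.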

In this case the first order bounds underestimate the bounds by
about 3.84 per cent of the first order bounds. More generally the
adequacy of the first order bounds may be measured by the formula

$$\frac{B[d_{rs}]}{B[d_{1,rs}]} = \frac{1}{1 - \eta \sum_i \sum_j |c_{ij}|}. \tag{5.3}$$

The first order bounds are very unsatisfactory if $\eta \sum_i \sum_j |c_{ij}|$ is
nearly one and they are of doubtful value unless
$\eta \sum_i \sum_j |c_{ij}| < 0.1$. Equation (5.3) implies that $\eta$ should be less
than $1/(10 \sum_i \sum_j |c_{ij}|)$ if the first order bounds are to
underestimate the bounds by 11% or less of the first order bounds.

We now prove (5.2). If $\eta$ is the maximum absolute value of any element
of $E$, then the maximum absolute value of each element of $EC$ is
given by

$$
B[EC] =
\begin{bmatrix}
\eta \sum_i |c_{i1}| & \eta \sum_i |c_{i2}| & \cdots & \eta \sum_i |c_{in}| \\
\eta \sum_i |c_{i1}| & \eta \sum_i |c_{i2}| & \cdots & \eta \sum_i |c_{in}| \\
\vdots & \vdots & & \vdots \\
\eta \sum_i |c_{i1}| & \eta \sum_i |c_{i2}| & \cdots & \eta \sum_i |c_{in}|
\end{bmatrix}.
\tag{5.4}
$$

We note that the rows of $B[EC]$ are identical. This is true also for the
rows of $B[EC]^2$, $B[EC]^3$, etc. Now $B[EC]^2$ may be found by multiplying
each element of $B[EC]$ by $\eta \sum_i \sum_j |c_{ij}|$; $B[EC]^3$ may be found by
multiplying each element of $B[EC]$ by $[\eta \sum_i \sum_j |c_{ij}|]^2$, etc.
Finally if $\eta \sum_i \sum_j |c_{ij}| < 1$ and if we put

$$k = 1 - \eta \sum_i \sum_j |c_{ij}| \tag{5.5}$$

we have

$$
B\Bigl[\sum_{t=1}^{\infty} (EC)^t\Bigr] =
\begin{bmatrix}
\dfrac{\eta \sum_i |c_{i1}|}{k} & \dfrac{\eta \sum_i |c_{i2}|}{k} & \cdots & \dfrac{\eta \sum_i |c_{in}|}{k} \\
\vdots & \vdots & & \vdots \\
\dfrac{\eta \sum_i |c_{i1}|}{k} & \dfrac{\eta \sum_i |c_{i2}|}{k} & \cdots & \dfrac{\eta \sum_i |c_{in}|}{k}
\end{bmatrix}
\tag{5.6}
$$

and by taking bounds in (3.4a) we get (5.2).
Table II provides bounds to $d_{ij}$ on the assumption that the maximum
error in each element of $A$ is $\eta = 0.0005$. Section IV of this paper provides
norms for the case in which $\eta_{ij}$ varies from 0.0005 to 0.0015. We could
use the methods of this section, with $\eta = 0.0015$, and find the bound for
the term having the largest possible error. This is at the junction of the
row and column of $C$ having the largest absolute sums. Thus from
Table I, we have at once

$$B[d_{44}] = \frac{(0.0015)(28.954)(29.376)}{1 - (0.0015)(73.910)} = 1.435$$

which, though obviously larger than the value of 0.441 in Table II with
$\eta = 0.0005$, is appreciably smaller than the bounds established in Section
IV with the use of norms. The methods of Section V are almost as easy
computationally as those of Section IV and often, as in this case, give
much more satisfactory results.

The bounds above are quite general, applying to any inverse matrix
and to any assumed error bounds, $\eta_{ij}$. For example, the results to this
point show that the absolute error in $c_{44}$ of Table I cannot be greater
than 0.441 if we assume that each $\eta_{ij} = 0.0005$, nor can it be greater
than 1.435 if the maximum $\eta_{ij}$ are as stated in the second section of
Table I. Bounds of this kind can be computed quickly and easily, and
often should be close enough for our purposes.

But, in general, it is possible to reduce the bounds in two ways: First,
by taking account of the signs of error terms; and, second, by using the
complete error matrix, $\eta_{ij}$, when its elements are unequal. Later sections
of this paper determine the lowest possible bounds for the general case;
that is, they show the extreme attainable values of $d_{rs}$, taking
account of signs of error terms, and making use of the entire $\eta_{ij}$ matrix.
Before doing this, however, we shall briefly consider a special class of
matrices in which there is no problem of signs.


VI. A SPECIAL CASE IN WHICH $B[d_{rs}]$ IS AN ATTAINABLE EXTREME


There are certain special cases in which the bounds of Section V are
actually attainable, and thus are the best possible bounds when
$\eta_{ij} = \eta$. One of these is the case in which all elements of $C$ have the
same sign, i.e., the signs are all positive or all negative.

The case in which the signs of $C$ are all positive is of current interest,
since it includes the so-called "Leontief Matrices" used in "inter-industry
analysis" or "input-output studies." Recent proofs are available
[25, p. 150] and [9] that the inverse of the Leontief matrix is composed
entirely of non-negative elements.

The positive nature of the elements of the inverses of matrices having
the mathematical properties of "Leontief" matrices was established in
1907 by Frobenius [8] and in 1920 by Janet [12, pp. 135-41]. Ostrowski
[18] [19] developed the properties of these matrices in considerable
detail. Though Ostrowski directed his efforts toward the determination of
the bounds for the determinants, he also gave excellent surveys of
previous work including the allied topic of bounds for the errors of the
solutions of linear equations.

More recently Taussky [23] has written on these determinants and
Woodbury [31] has written on the properties of "Leontief" matrices.
They provide extensive bibliographies. The stability of the inverse
Leontief matrix has been studied by Woodbury [31].

We first illustrate the fact that if the inherent errors associated with
the elements of a matrix have a common bound, $\eta$, the methods of
Section V provide the lowest possible bound to the discrepancy terms
of the inverse. In other words, in the Leontief case $B[d_{rs}]$ is an
attainable extreme.

In actual practice, inter-industry analyses are usually made by the
use of Leontief matrices of large order. The Bureau of Labor Statistics
has constructed such a matrix of order 450, and plans to invert one of
order 200. However, the principle can be shown by considering the
simple 3-rowed Leontief matrix in Table III.

We note that the computed inverse, $C$, is composed entirely of positive
elements. The discrepancy bounds $B[d_{rs}]$ are computed by (5.2)
using the row-sums and column-sums of $C$, and assuming $\eta = 0.005$.
These discrepancies could be actually reached in the extreme case in
which the error of each element of $A$ is negative. This is shown in the
final two sections of the table. First we adjust $A$ by subtracting 0.005
from each element, then invert the adjusted matrix. The differences
between the inverse of the adjusted matrix and the inverse of the original
matrix are precisely $B[d_{rs}]$ as computed, aside from an error of one
digit in the first element of the second row and the second element of
the third row. These errors are due to rounding of both inverses to two
decimals.

The principle illustrated with this simple numerical matrix holds for
any Leontief matrix, no matter how large, as is shown in Section VIII.
The $B[d_{rs}]$ bound provides extreme values of the discrepancies which
would be reached if all inherent errors were at their maxima, and all
were negative. Therefore, the bound cannot be improved in case of a
Leontief matrix with a common bound to the inherent errors.


TABLE III

                                                          Absolute
                                                          row-sums
Leontief matrix,            1.00    -0.50    -0.30
A                          -0.40     1.00    -0.30
                           -0.20    -0.30     1.00

Computed inverse,           1.56     1.01     0.77          3.34
C                           0.79     1.61     0.72          3.12
                            0.55     0.68     1.37          2.60

Absolute column-sums        2.90     3.30     2.86    $\sum\sum |c_{ij}| = 9.06$

Discrepancy bounds,         0.05     0.06     0.05
assuming $\eta = 0.005$,    0.05     0.05     0.05
$B[d_{rs}]$                 0.04     0.04     0.04

Adjusted matrix             0.995   -0.505   -0.305
B = [$a_{ij}$ - 0.005]     -0.405    0.995   -0.305
                           -0.205   -0.305    0.995

Computed inverse of B       1.61     1.07     0.82
                            0.83     1.66     0.77
                            0.59     0.73     1.41
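
The attainability of the bound for this matrix can be confirmed without rounding to two decimals. The sketch below is illustrative only: `inverse` is a hypothetical pure-Python Gauss-Jordan routine standing in for whatever inversion method is preferred, and the matrix is the 3-rowed Leontief matrix of Table III with $\eta = 0.005$.

```python
def inverse(M):
    """Gauss-Jordan inverse with partial pivoting (pure Python)."""
    n = len(M)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

A = [[1.00, -0.50, -0.30],
     [-0.40, 1.00, -0.30],
     [-0.20, -0.30, 1.00]]
eta = 0.005

C = inverse(A)
adjusted_inverse = inverse([[a - eta for a in row] for row in A])

row_sums = [sum(abs(v) for v in row) for row in C]
col_sums = [sum(abs(C[i][j]) for i in range(3)) for j in range(3)]
k = 1 - eta * sum(row_sums)
bound = [[eta * r * s / k for s in col_sums] for r in row_sums]
actual = [[t - c for t, c in zip(tr, cr)] for tr, cr in zip(adjusted_inverse, C)]

for b_row, a_row in zip(bound, actual):
    print([round(v, 4) for v in b_row], [round(v, 4) for v in a_row])
```

With unrounded inverses the actual discrepancies and the bounds coincide to rounding error, which suggests that the one-digit differences noted in the text arise only from rounding the inverses to two decimals.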


VII. A LEONTIEF MATRIX WITH UNIFORM PERCENTAGE ERRORS


In some practical cases, the bound of an inherent error, $\eta_{ij}$, may
be some proportion of $a_{ij}$. For example, we may set the bound

(7.1)    $\eta_{ij} \le k a_{ij}, \qquad 0 < k < 1.$

Consider a Leontief matrix, with inherent errors bounded by (7.1).
In this case we can determine a very simple bound to the discrepancies
in the inverse. Since each element of the inverse is positive, the extreme
discrepancies will occur when all errors are negative. In this case, the
“true” matrix would be

(7.2)    $T = \begin{bmatrix} a_{11}(1-k) & a_{12}(1-k) & \cdots & a_{1n}(1-k) \\ a_{21}(1-k) & a_{22}(1-k) & \cdots & a_{2n}(1-k) \\ \vdots & \vdots & & \vdots \\ a_{n1}(1-k) & a_{n2}(1-k) & \cdots & a_{nn}(1-k) \end{bmatrix}$

and more briefly,

(7.3)    $T = A(I - kI).$

The inverse of the true matrix would be

(7.4)    $T^{-1} = \frac{I}{1-k} A^{-1} = \frac{1}{1-k} C,$

which would be obtained by multiplying each element of C by 1/(1 - k).
Therefore, the maximum discrepancy in this case is

(7.5)    $B[d_{rs}] = \frac{c_{rs}}{1-k} - c_{rs} = \frac{k}{1-k} c_{rs}.$

Thus, if the maximum inherent error associated with any element of
a Leontief matrix is 5 per cent of that element, the maximum
discrepancy associated with any element of the inverse is 0.05/0.95 times
that element of the inverse. This is the closest bound which could be
established, since the discrepancies are extreme values which could be
reached if the errors were distributed in the worst possible way. The
bound stated in (7.5) holds for a Leontief matrix of any order and is
extremely simple to apply.
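
As a small worked illustration, take k = 0.05 and the element 1.56 in the first row and first column of the two-decimal inverse of Table III. These particular values are chosen here only as an example; the second line simply confirms that the scaled inverse of (7.4) does invert the scaled matrix of (7.3).

```latex
B[d_{11}] = \frac{k}{1-k}\,c_{11} = \frac{0.05}{0.95}\times 1.56 \approx 0.082,
\qquad
T\,T^{-1} = (1-k)\,A\cdot\frac{1}{1-k}\,C = AC = I .
```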


VIII. LEMMAS CONCERNING THE EXTREME DISCREPANCIES 


The discussion of the earlier sections is concerned with bounds for the
first order discrepancy matrix and the discrepancy matrix. Aside from
the methods of Section VI and Section VII, which are applicable to
special matrices, there is no guarantee that the bound can actually be
attained. In this and the sections of the paper immediately following we
work out somewhat more complicated techniques for finding the best
possible bounds, i.e., smaller bounds than those described above and
bounds which cannot be replaced by yet smaller bounds because these
can actually be attained in some extreme problem. These discrepancy
bounds we call “extreme discrepancies.” The theory also enables us to
produce the matrix, A+E, which would maximize the discrepancy between
a particular element of $A^{-1}$ and the corresponding element of
$(A+E)^{-1}$. The derivation is somewhat more involved than is the
application, which is very satisfactory for the case of common bounds.

The solution is based on two lemmas, proved below, which establish
the fact that a specific sign can be assigned to each element of the error
matrix, once the $c_{ij}$ are known, so as to obtain the extreme discrepancy
for any given element of the inverse matrix. The first lemma deals with
extreme first order discrepancies and the second with extreme total
discrepancies. With the signs determined, it is possible to find the
discrepancy matrix which results from the specified errors in several ways.

We know from (3.9) that the first order approximation to the
discrepancy matrix is given by $-CEC$. Formal expansion of this product
shows that the value of the element in the rth row and sth column of
the first order discrepancy matrix is $-C_{r\cdot} E C_{\cdot s}$, where
$C_{r\cdot}$ is the vector of elements in the rth row and $C_{\cdot s}$ is
the vector of elements in the sth column. The first order discrepancy for
the element in the rth row and sth column of the inverse can then be written

(8.1)    $d_{1;rs} = -\sum_i \sum_j c_{ri}\, e_{ij}\, c_{js}.$

We are free to give the signs to the $e_{ij}$ so as to obtain the extreme
$d_{1;rs}$. If we take the sign of $e_{ij}$ as opposite to that of the
product $c_{ri}c_{js}$, then $d_{1;rs}$ is as large (positively) as is
possible. Similarly if the sign of $e_{ij}$ is taken as that of
$c_{ri}c_{js}$, $d_{1;rs}$ takes on its extreme negative value. We
introduce the definitions

(8.2)    $\theta_{ij} = 1$ when $c_{ri}$ has the same sign as $c_{js}$,
         $\theta_{ij} = -1$ when $c_{ri}$ has the opposite sign to that of $c_{js}$,
         $\theta_{ij} = 0$ when $c_{ri}$ or $c_{js} = 0$,
         $e_{ij} = \theta_{ij}\epsilon_{ij}$ where $\epsilon_{ij} \ge 0$.

Then (8.1) becomes

(8.3)    $L[d_{1;rs}] = -\sum_i \sum_j c_{ri}\,\theta_{ij}\,\epsilon_{ij}\, c_{js},$

where $L[d_{1;rs}]$ indicates the smallest possible (attainable) value of
$d_{1;rs}$. Similarly if $\theta'_{ij} = -\theta_{ij}$, we have

(8.4)    $L'[d_{1;rs}] = -\sum_i \sum_j c_{ri}\,\theta'_{ij}\,\epsilon_{ij}\, c_{js} = \sum_i \sum_j c_{ri}\,\theta_{ij}\,\epsilon_{ij}\, c_{js}$

is the largest positive value of $d_{1;rs}$.


The extreme first order discrepancy of a given element is not
necessarily associated with the extreme first order discrepancy of another
element. Theoretically it is necessary to fix the signs of each $e_{ij}$ in
order to obtain the extreme discrepancy for some one element. Sometimes
these signs will also maximize the discrepancies associated with several
other elements, or even of all elements. Thus if the values of $c_{ij}$ are
all positive, all $\theta_{ij} = 1$, $\theta'_{ij} = -1$ no matter what the
values of r and s. If the $e_{ij}$ take on values as large as $\eta_{ij}$ in
absolute value, (8.3) and (8.4) become

(8.5)    $L[d_{1;rs}] = -\sum_i \sum_j c_{ri}\,\theta_{ij}\,\eta_{ij}\, c_{js},$
         $L'[d_{1;rs}] = \sum_i \sum_j c_{ri}\,\theta_{ij}\,\eta_{ij}\, c_{js},$

while if $\eta_{ij} = \eta$, we get

(8.6)    $L[d_{1;rs}] = -\eta \sum_i \sum_j c_{ri}\,\theta_{ij}\, c_{js},$
         $L'[d_{1;rs}] = \eta \sum_i \sum_j c_{ri}\,\theta_{ij}\, c_{js}.$

We now state Lemma 1. The extreme first order discrepancies of $c_{rs}$ are
given by (8.5) and (8.6).
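
The sign rule behind Lemma 1 can be checked by brute force on a small case. The sketch below is only an illustration: it assumes the three-decimal inverse C tabulated in Table V, a common bound η = 0.0005, and the element in the fourth row and fourth column; the function and variable names are ours. It builds the sign matrix of (8.2), evaluates (8.1) with the signs opposed to it, and compares the result with an exhaustive search over all sign patterns of the sixteen errors.

```python
from itertools import product

# Inverse matrix C to three decimals (values as tabulated in Table V).
C = [[6.483, 5.153, 3.311, -13.626],
     [-4.798, 6.485, -0.841, -0.285],
     [0.694, -1.627, 1.084, -0.147],
     [-2.086, -9.095, -3.299, 14.896]]
eta = 0.0005      # common error bound
r, s = 3, 3       # zero-based indices of the element c_44
n = len(C)


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


# theta_ij = sign(c_ri) * sign(c_js), following definition (8.2).
theta = [[sign(C[r][i]) * sign(C[j][s]) for j in range(n)] for i in range(n)]


def first_order(E):
    """Element (r, s) of -CEC, i.e. formula (8.1)."""
    return -sum(C[r][i] * E[i][j] * C[j][s] for i in range(n) for j in range(n))


# Errors with signs opposite to theta give the largest positive value.
E_max = [[-eta * theta[i][j] for j in range(n)] for i in range(n)]
largest = first_order(E_max)

# Exhaustive search over all 2**16 sign patterns with |e_ij| = eta.
weights = [C[r][i] * C[j][s] for i in range(n) for j in range(n)]
brute = max(-eta * sum(sg * w for sg, w in zip(signs, weights))
            for signs in product((-1, 1), repeat=n * n))

print(round(largest, 3), round(brute, 3))
```

The two numbers coincide, which is what the lemma asserts for the first order discrepancy of a single element.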

Once the signs of the $e_{ij}$ are determined, the values of the other
elements of the first order discrepancy matrix, though not necessarily
extreme discrepancies, may be computed with the formulas

(8.7)    $d_{1;pq} = \mp \sum_i \sum_j c_{pi}\,\theta_{ij}\,\eta_{ij}\, c_{jq},$
         $d_{1;pq} = \mp\, \eta \sum_i \sum_j c_{pi}\,\theta_{ij}\, c_{jq}.$

We next consider the question of finding extreme discrepancies rather
than extreme first order discrepancies. Using (3.5) and (3.5)' we get

(8.8)    $D = T^{-1} - A^{-1} = -T^{-1}EC = -CET^{-1}.$


If we knew the signs of the elements of $T^{-1}$ we could determine the
signs of E so as to obtain an extreme value of $d_{rs}$, just as we did of
$d_{1;rs}$ above. We do not know the value of $T^{-1}$, but if the relative
errors are small, the elements of $T^{-1}$ are close to those of C. If the
relative errors are small enough so that the signs of the elements of
$T^{-1}$ are identical with the signs of the elements of C, then the
extreme discrepancies occur simultaneously with the extreme first order
discrepancies. This is Lemma 2.

One does not know, of course, at the beginning that this condition of
identical signs is satisfied. But he computes the greatest possible
discrepancy for each element, with the given bounds for $e_{ij}$, and knows
that he has the extreme discrepancy for each element if the discrepancy
is less in absolute value than the corresponding element of C.

The calculation of this potential extreme discrepancy is accomplished
in various ways, as is shown in the sections following. We wish to
indicate here that the value of this extreme discrepancy element

(8.9)    $L[d_{rs}] = L[d_{1;rs}]\,[\,I - EC + (EC)^2 - \cdots\,]$

is obtained by using the same E in the series which is used in obtaining
$L[d_{1;rs}]$ from CEC. This formula can also be written as

(8.10)    $D = -CEC + CECEC - CECECEC + \cdots$

where E is taken so as to obtain extreme discrepancies for $d_{rs}$.
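
The series (8.10) can be compared with a direct recomputation of the inverse. The following Python sketch is a hedged illustration rather than the authors' procedure: it assumes the three-decimal C of Table V, recovers A approximately as the inverse of that C, and uses the error matrix of Table VI (common bound 0.0005, signs chosen to maximize the element in the fourth row and column). The helpers `matmul`, `inverse`, `neg` and `add` are ours.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def neg(M):
    return [[-x for x in row] for row in M]


def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


C = [[6.483, 5.153, 3.311, -13.626],
     [-4.798, 6.485, -0.841, -0.285],
     [0.694, -1.627, 1.084, -0.147],
     [-2.086, -9.095, -3.299, 14.896]]
eta = 0.0005
s = [-1, -1, -1, 1]   # signs of the fourth row (and fourth column) of C
E = [[-eta * s[i] * s[j] for j in range(4)] for i in range(4)]  # as in Table VI

# Three terms of (8.10): D = -CEC + CECEC - CECECEC + ...
term = neg(matmul(matmul(C, E), C))           # -CEC
D_series = [[0.0] * 4 for _ in range(4)]
for _ in range(3):
    D_series = add(D_series, term)
    term = neg(matmul(matmul(term, E), C))    # next term of the series

# Direct recomputation: D = (A + E)^{-1} - C, with A taken as the inverse of C.
A = inverse(C)
T = add(A, E)
D_direct = add(inverse(T), neg(C))

print(round(D_series[3][3], 3), round(D_direct[3][3], 3))
```

With an error bound this small, three terms of the series already agree with the recomputed inverse to about four decimal places.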

We now return to the problem discussed in Section VI. We know that
the elements of the inverse matrix are all non-negative so the signs of
all $e_{ij}$ terms are taken as non-negative. The extreme negative first
order discrepancies are obtained by setting $e_{ij} = \eta_{ij}$. When
$\eta_{ij} = \eta$, the absolute values of the first order discrepancies
become $\eta \sum_i \sum_j c_{ri}\,\theta_{ij}\, c_{js}$. Since the elements
of $\theta$ and those of the inverse matrix are non-negative, this
becomes $\eta \sum_i c_{ri} \sum_j c_{js}$ and this is the equivalent of
(5.1). Hence (5.1) gives the absolute value of the extreme first order
discrepancies.

If all the extreme first order discrepancies are smaller than the
corresponding values of c, the formula (5.2) gives the extreme
discrepancies, as is illustrated in Table III and as clarified in sections
following which treat also the case of unequal bounds.

The Leontief matrix is not the only special case in which all maximum 
discrepancies are attained at once. Any matrix having an inverse whose 
elements alternate in sign also has this property. More generally, if the 
signs of each row (column) of the inverse matrix are equal to (or the 
negative of) the signs of every other row (column), the matrix takes on 
extreme values, positive or negative, simultaneously. Thus the elements 
of the matrix having an inverse with signs 


fae ы aL LA 
5 P E É 
ar = — СЕ 
ЗЕ cw = + 
take on extreme values simultaneously. 


Some elements may take on extreme values simultaneously even
though not all the elements do, if the signs of the corresponding rows
(or columns) of the inverse matrix are identical or opposite. Thus the
extreme negative discrepancy of $c_{14}$ in Table I occurs simultaneously
with the extreme positive discrepancy of $c_{44}$.


IX. THE COMPUTATION OF THE INVERSE OF AN ADJUSTED MATRIX 


Once we know just what are the values of E which give extreme
discrepancies, we can avoid the expansion of a series suggested in the last
section by using techniques which have been worked out to handle
adjusted matrices [5, p. 292]. An adjusted matrix is one in which some
element (or elements) is replaced by some other element (or elements).
Formulas featuring determinants were worked out by Sherman and
Morrison [21] [22] and Woodbury [15] for the case of adjustment of a
single element and for the case of the adjustment of the elements of a
column or row. These adjustments have been described in matrix form
[5, p. 293]. A more general type of adjustment has been worked out by
Bartlett [1] in which all elements of the matrix may be adjusted if the
matrix of adjustments can be expressed as the product of two vectors.
Still more general formulas for the adjustment of matrices have been
given by Woodbury [30]. The attention of the reader is also called to
the earlier (1936) work of Wittmeyer [29] who used
$\max_i [\lambda_i(A'A)]^{1/2}$ and $\min_i [\lambda_i(A'A)]^{1/2}$ in
obtaining bounds for the errors in adjusted systems of equations, and to
the recent work of engineers [28] [16] [3].

The two general formulas given by Woodbury, which are slightly
altered here to parallel the formulas of Section 3, are

(9.1)    $D = (A + USV)^{-1} - A^{-1} = -CUS(S + SVCUS)^{-1}SVC$

or if S is non-singular,

(9.2)    $D = (A + USV)^{-1} - A^{-1} = -CU(S^{-1} + VCU)^{-1}VC.$

These are too general to be useful to us here in their general form,
though special cases are very useful. For the general problem it might
be more practical to calculate $(A+E)^{-1}$, when E is known, by the
method of the next section rather than to follow the computation of
$(S^{-1} + VCU)^{-1}$, or of $(S + SVCUS)^{-1}$, by the additional matrix
operations demanded by (9.1) or (9.2).

These formulas are very useful for the special case in which S is a
scalar, say $\sigma$. Then u is a column vector, v a row vector and

(9.3)    $D = (A + u\sigma v)^{-1} - A^{-1} = -Cu\sigma(\sigma + \sigma vCu\sigma)^{-1}\sigma vC$

becomes

(9.4)    $D = -\dfrac{\sigma\, CuvC}{1 + \sigma\, vCu}.$

Proof of (9.4) follows by pre-multiplication or post-multiplication by
$A + E = A + u\sigma v$. Following the style of Bartlett [1], one may arrive
at this formula by means of formal substitution in (3.4).

This formula for the inverse of an adjusted matrix is very useful in
obtaining the extreme discrepancy matrix when the errors of the elements
of the original matrix have common bounds, as is shown in Section XI.
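
Formula (9.4) is easy to verify numerically. The sketch below is an illustration under stated assumptions: the 3 by 3 Leontief matrix of Table III is used as A, while the sign vectors u and v and the scalar σ = 0.005 are hypothetical choices made here only for the check; `matmul` and `inverse` are our own helpers.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


A = [[1.00, -0.50, -0.30],
     [-0.40, 1.00, -0.30],
     [-0.20, -0.30, 1.00]]
C = inverse(A)

u = [[1.0], [-1.0], [1.0]]    # column vector (hypothetical signs)
v = [[1.0, 1.0, -1.0]]        # row vector (hypothetical signs)
sigma = 0.005                 # scalar adjustment (hypothetical size)

# Rank-one adjustment E = u * sigma * v.
E = [[sigma * u[i][0] * v[0][j] for j in range(3)] for i in range(3)]

# Formula (9.4): D = -sigma * C u v C / (1 + sigma * v C u).
vCu = matmul(matmul(v, C), u)[0][0]
CuvC = matmul(matmul(C, u), matmul(v, C))
D_formula = [[-sigma * x / (1 + sigma * vCu) for x in row] for row in CuvC]

# Direct computation: D = (A + E)^{-1} - C.
T = [[A[i][j] + E[i][j] for j in range(3)] for i in range(3)]
T_inv = inverse(T)
D_direct = [[T_inv[i][j] - C[i][j] for j in range(3)] for i in range(3)]

max_gap = max(abs(D_formula[i][j] - D_direct[i][j])
              for i in range(3) for j in range(3))
print(max_gap)
```

The gap between the two computations is at the level of rounding error, so only one inversion (that of A) is needed once E has rank one.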


X. THE RECALCULATION OF THE INVERSE 


The equation (3.2) rather than (3.4) might be used once the values
of the error matrix which give extreme discrepancies are known. This
method calls for the calculation of $T^{-1} = (A+E)^{-1}$ and thence
$D = T^{-1} - C$. It is extremely simple in theory and is especially useful
where machines are available for inverting matrices efficiently. Ideally
the reduction method should be some exact method such as the method
of determinants [5, p. 76] though good approximation methods would
be satisfactory, particularly if corrected to a given number of decimal
places by the method of Section XIII. It may be necessary to make a
recalculation for each element of the inverse for which it is desired to
obtain the extreme discrepancy.


XI. EXTREME DISCREPANCIES FOR THE CASE OF COMMON BOUNDS


We are now in a position to improve the treatment of the case
discussed in Section 5 in that we can produce extreme discrepancies. We
have three different means of calculation at our disposal. These are

(a) the method of expansion in series,

(b) the method of adjusted matrices,

(c) the method of recalculation.

We also discuss (d) the treatment of joint extreme elements. We discuss
them in order. Application is made to the problem of Tables I and II
with $\eta = 0.0005$.

(a) The method of expansion in series. In this case we use the matrix
expansion (8.10) and compute the matrices CEC, CECEC, CECECEC,
···, until convergence is attained to the desired number of places. If
$\eta_{ij} = \eta$ and $E = \eta\theta'$, the value of D is

(11.1)    $D = \eta\,[C\theta'C] + \eta^2\,[C\theta'C\theta'C] + \eta^3\,[C\theta'C\theta'C\theta'C] + \cdots$


The illustration of this method is presented in Table IV, where the
extreme discrepancy for $c_{44}$ and the other associated discrepancies are
computed. The value of the matrix $\theta'$ is obtained as the product of
the column vector of signs of the fourth column of C and the row vector of
signs of the fourth row of C. This result is placed at the top of Table IV.
The three decimal place value of C is placed below $\theta'$, below which
appears $\theta'C$. The values $C\theta'C$, $C\theta'C\theta'C$,
$C\theta'C\theta'C\theta'C$ are computed and recorded below. The
corresponding elements of these matrices are then multiplied respectively
by $\eta$, $\eta^2$, $\eta^3$ and the results are added to obtain
the discrepancy matrix. The elements of this discrepancy matrix are
less in absolute value than the elements of the inverse matrix above so
we may say that the extreme discrepancy of the $d_{44}$ term is 0.438. This
is slightly less than the 0.441 obtained in Table II. If $\eta$ were larger,
the improvement would be greater.

For this problem the $d_{44}$ term is somewhat smaller in absolute value
when the values of $\theta_{ij}$ are used in place of $\theta'_{ij}$. Thus if
$E = -\eta\theta'$ instead of $\eta\theta'$, (11.1) becomes

(11.2)    $D = -\eta\, C\theta'C + (-\eta)^2\, C\theta'C\theta'C + (-\eta)^3\, C\theta'C\theta'C\theta'C + \cdots$

and the values are those of sums of alternating series. Thus in Table IV

$d_{44} = -0.0005(850.553) + (0.0005)^2(50504) - (0.0005)^3(300 \cdot 10^4) + \cdots = -0.413.$


TABLE IV
(fourth column of each matrix)

                                 Row 1       Row 2       Row 3       Row 4
θ'                                  -1          -1          -1           1
C                              -13.626      -0.285      -0.147      14.896
θ'C                            -28.954     -28.954     -28.954      28.954
Cθ'C                          -827.303     -32.747      -8.628     850.553
Cθ'Cθ'C                        -49,124      -1,944        -512      50,504
Cθ'Cθ'Cθ'C · 10^{-4}              -292         -12          -3         300
D                               -0.426      -0.017      -0.004       0.438


(b) The method of adjusted matrices. The value of E must be taken, in
order to obtain the extreme r, s term in the discrepancy matrix, as

$E = [e_{ij}] = u\eta v$

where u is a column vector having elements with absolute value unity
with signs of $c_{r\cdot}$; v is a row vector with unit elements, aside from
sign, and with signs identical with those of $c_{\cdot s}$. The sign of the
scalar $\eta$ is not yet determined. In order to attain the extreme value,
positive or negative, the sign of $\eta$ should be taken opposite to that of
vCu so that the denominator of (9.4) becomes as small as possible. If vCu
is positive, $\eta$ should have a negative sign and the formula for the
discrepancy matrix is

(11.3)    $D = \dfrac{\eta\, CuvC}{1 - \eta\, vCu} = \dfrac{D_1}{1 - \eta\, vCu}.$

Now since the numerator of (11.3) is the value of $D_1$ as found in Section
III, we need only calculate $1 - \eta vCu$ and divide it into $D_1$ to find
the values of the discrepancy matrix associated with some extreme
discrepancy element. The method is illustrated in Table V where, again,
it is desired to obtain the extreme value of the $d_{44}$ term. The value of
C is recorded and the column sums are recorded after each element has
been multiplied by the sign of the corresponding element in the fourth
column. Corresponding row sums are obtained after the elements have
been multiplied by the signs in the fourth row. These marginal entries
are summed after multiplication by the signs of the fourth row (or
fourth column) to get vCu = 59.378. Since vCu > 0, $\eta$ is given a
negative sign. The value of $C\theta'C$ is then obtained by multiplying the
marginal entries, as in Table II. The results agree with the results of the
more formal methods of Table IV.

The values of $D_1$ are then obtained by multiplication by 0.0005 and
the values of D by dividing the values of $D_1$ by
1 - (0.0005)(59.378) = 0.970311. The resulting values are in complete
agreement with the results of Table IV. They are obtained much more
easily with the method of Table V.
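
The arithmetic of Table V can be sketched in a few lines of Python. This is an illustrative check rather than the authors' worksheet: the three-decimal C is taken from Table V, the variable names are ours, and the sign vectors follow the description above (u from the fourth row of C, v from the fourth column).

```python
C = [[6.483, 5.153, 3.311, -13.626],
     [-4.798, 6.485, -0.841, -0.285],
     [0.694, -1.627, 1.084, -0.147],
     [-2.086, -9.095, -3.299, 14.896]]
eta = 0.0005
n = len(C)


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


u = [sign(C[3][i]) for i in range(n)]   # signs of the fourth row of C
v = [sign(C[j][3]) for j in range(n)]   # signs of the fourth column of C

# Marginal entries of Table V: the vector Cu (right margin) and vC (bottom margin).
row_margin = [sum(C[p][i] * u[i] for i in range(n)) for p in range(n)]
col_margin = [sum(v[j] * C[j][q] for j in range(n)) for q in range(n)]
vCu = sum(v[j] * row_margin[j] for j in range(n))

# First order discrepancies: eta times the products of the marginal entries.
D1 = [[eta * row_margin[p] * col_margin[q] for q in range(n)] for p in range(n)]

# Extreme discrepancies for d_44 positive, and the opposite set.
D_pos = [[x / (1 - eta * vCu) for x in row] for row in D1]
D_neg = [[-x / (1 + eta * vCu) for x in row] for row in D1]

print(round(vCu, 3), round(D_pos[3][3], 3), round(D_neg[3][3], 3))
```

Only sums, one outer product and one division are needed, which is why this route is so much lighter than the series of Table IV.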

We should note further that the other set of extreme discrepancies
(having signs opposite to those of this set) are obtained by using the
multiplier

$\dfrac{-\eta}{1 + \eta\, vCu} = \dfrac{-0.0005}{1 + (0.0005)(59.378)} = \dfrac{-0.0005}{1.029689}.$

The value of the extreme negative $d_{44}$ term, for example, is

$\dfrac{(-850.553)(0.0005)}{1.029689} = -0.413,$

as shown above.


TABLE V
METHOD OF ADJUSTED MATRICES

             6.483       5.153       3.311     -13.626   |   -28.573
            -4.798       6.485      -0.841      -0.285   |    -1.131
C            0.694      -1.627       1.084      -0.147   |    -0.298
            -2.086      -9.095      -3.299      14.896   |    29.376
            -4.465     -19.106      -6.853      28.954   |    59.378

           127.578     545.916     195.811    -827.303
Cθ'C         5.050      21.609       7.751     -32.747
             1.331       5.694       2.042      -8.628
          -131.164    -561.258    -201.314     850.553

             0.064       0.273       0.098      -0.414
D_1          0.003       0.011       0.004      -0.016
             0.001       0.003       0.001      -0.004
            -0.066      -0.281      -0.101       0.425

             0.066       0.281       0.101      -0.426
D            0.003       0.011       0.004      -0.017
             0.001       0.003       0.001      -0.004
            -0.068      -0.289      -0.104       0.438


(c) The method of recalculation. The method of recalculation,
described in Section X, is illustrated in Table VI. The matrix of error
bounds, with signs assigned so as to maximize the $d_{44}$ term, is first
exhibited. This matrix is then added to A to obtain the specific A+E.
The inverse is then computed. This computation is not exhibited in
Table VI but the actual calculation used the method of determinants
so that the answers can be guaranteed to four decimal places. The
discrepancy matrix, formed by subtraction of the four place value of
$A^{-1}$, is recorded to three decimal places. It is identical with the
three-place discrepancy matrix of Table IV and Table V.


TABLE VI
METHOD OF RECALCULATION

                -0.0005     -0.0005     -0.0005      0.0005
E               -0.0005     -0.0005     -0.0005      0.0005
                -0.0005     -0.0005     -0.0005      0.0005
                 0.0005      0.0005      0.0005     -0.0005

                 0.4945      0.2745      0.0975      0.4595
A+E              0.4355      0.4485      0.2635      0.4105
                 0.3945      0.5565      1.3185      0.3855
                 0.4235      0.4365      0.4675      0.4665

                 6.5488      5.4340      3.4122    -14.0519
(A+E)^{-1}      -4.7956      6.4966     -0.8369     -0.3015
                 0.6944     -1.6243      1.0856     -0.1518
                -2.1539     -9.3842     -3.4025     15.3346

                 6.4831      5.1527      3.3112    -13.6256
C = A^{-1}      -4.7982      6.4853     -0.8408     -0.2846
                 0.6937     -1.6270      1.0845     -0.1474
                -2.0863     -9.0950     -3.2987     14.8963

                 0.066       0.281       0.101      -0.426
D                0.003       0.011       0.004      -0.017
                 0.001       0.003       0.001      -0.004
                -0.068      -0.289      -0.104       0.438


The extreme negative value for $d_{44}$ is obtained by this method by
taking the signs of E opposite to those of Table VI. Another
recalculation is then necessary.

(d) Extreme discrepancies associated with the elements. Frequently we
desire the extreme discrepancy associated with each individual element,
rather than the extreme discrepancy of some particular element. This
can be done, for the case of common bounds, by the method of adjusted
matrices but the values of the extreme $C\theta C$ and $1 + \eta vCu$ must be
calculated for each element unless we have the fortunate situation,
characterized by Leontief matrices, where the extreme discrepancies of many
elements are taken on simultaneously, as illustrated in Table III.

The technique is illustrated in the general case in Table VII where
$\eta = 0.0005$.


TABLE VII
TREATMENT OF JOINT EXTREME ELEMENTS

C (left), signed row sums (right), absolute row sums (margin)
    6.483    5.153    3.311  -13.626 |   28.573     8.985    18.267   -28.573 |  28.573
   -4.798    6.485   -0.841   -0.285 |    1.131    12.409   -11.839    -1.131 |  12.409
    0.694   -1.627    1.084   -0.147 |    0.298    -3.258     3.552    -0.298 |   3.552
   -2.086   -9.095   -3.299   14.896 |  -29.376   -18.606   -11.186    29.376 |  29.376

Signed column sums (left), vCu (right), absolute column sums (bottom margin)
   14.061    6.136    8.535  -28.384 |   57.116    11.924    44.844   -57.116
    3.077   22.360    4.685  -28.660 |   58.782    43.258    14.062   -58.782
   14.061    6.136    8.535  -28.384 |   57.116    11.924    44.844   -57.116
   -4.465  -19.106   -6.853   28.954 |  -59.378   -36.742   -21.166    59.378
   14.061   22.360    8.535   28.954 |                                        |  73.910

Extreme CθC (left), 1 + ηvCu with the sign of the discrepancy (right)
  401.765  638.892  243.871  827.303 |  0.971442  0.970609  0.971442 -0.970311
  174.483  277.465  105.911  359.290 |  0.994038  0.978371  0.994038 -0.981629
   49.945   79.423   30.316  102.845 |  0.977578  0.992969  0.977578 -0.989417
  413.056  656.847  250.724  850.553 | -0.971442 -0.970609 -0.971442  0.970311 |  0.963045

Extreme discrepancies
    0.207    0.329    0.126   -0.426
    0.088    0.142    0.053   -0.183
    0.026    0.040    0.016   -0.052
   -0.213   -0.338   -0.127    0.438


The columns of C are added after multiplication by the
signs of each specific column of C to get the matrix below C. Thus
14.061 is the sum of the elements of the first column of C when
multiplied by the signs of the elements of this column, 3.077 is obtained by
adding the products of the elements in the first column of C with the
signs of the elements in the second column of C, etc. The elements of
the rows of this matrix are then multiplied by the signs of the
corresponding elements of the rows of C and added to get the respective
values of vCu, which are placed in a matrix at the right of this second
matrix. These values are checked by the matrix at the right of C which is
obtained with the use of row sums rather than column sums. These four
matrices are bordered with the sums of the absolute values of the elements
in the rows and columns of C, and with the sum of the absolute values of
$c_{ij}$ which is presented in the lower right. This portion of the table
corresponds to a part of Table II.

The values of the extreme elements of $C\theta C$ for each value of $e_{ij}$
are then computed by multiplication of the diagonal term of the row sum
matrix by the diagonal term of the column sum matrix, or what
amounts to the same thing, the values of
$\sum_i |c_{ri}| \sum_j |c_{js}|$ are computed and agree with those of
Table II. The values of $1 + \eta vCu$ are then computed and are placed in a
matrix at the right of the $C\theta C$ matrix. A transposition of the (vCu)
matrix is effected so that the elements of the recorded matrix correspond
to the elements of the $C\theta C$ matrix.

The actual values of the extreme discrepancy matrix are then
computed. A negative sign is assigned to certain elements since, in this
case, vCu is negative and the element of the discrepancy matrix must be
negative in order to attain this extreme value.

In this problem, since $\eta$ is small, these results are not much better
than those of Table II, and they involved considerable additional
calculation. In situations such as this, the method of Table II is to be
recommended. In other situations, particularly where $\eta$ is large and
when the values of vCu differ appreciably in absolute value from
$\sum_i \sum_j |c_{ij}|$, the method just described usually gives much
better results.


XII. EXTREME DISCREPANCIES FOR THE CASE OF UNEQUAL BOUNDS


The methods of the last section are useful in treating the case of
common bounds but they are not all so useful in handling the general
case of unequal bounds. In particular, the method of adjusted matrices
indicated in the last section is not usually applicable since the E matrix
cannot usually be written in the form $u\eta v$. Though the matrix of
extreme errors

        | 0.0001   0.0002   0.0003 |
E  =    | 0.0002   0.0004   0.0006 |
        | 0.0003   0.0006   0.0009 |

can be written as

        | 1 |
E  =    | 2 | [0.0001] [1  2  3],
        | 3 |

the matrix of errors of Table I cannot be so written. What can we do
in this case?

One answer to this question is to replace each $\eta_{ij}$ by the largest
$\eta_{ij} = \eta$ and then use the methods of the last section. We might
take $\eta = 0.0015$ rather than 0.0005 in Tables IV, V, VI and VII and thus
obtain answers to the problem of Table I. For example, we get from
Table V

$d_{44} = \dfrac{(0.0015)(850.553)}{1 - (0.0015)(59.378)} = 1.401.$

These results give bounds, rather than extreme values, and we do
not know how much these bounds might be improved. We are still
faced with the problem of finding the extreme discrepancies.

Perhaps the simplest method, at least in theory, to use is the method
of recalculation. The signs of the error terms are determined as in the
last section. Using the $\eta_{ij}$ of Table I we find that the inversion of

             0.4940   0.2745   0.0975   0.4600
A+E  =       0.4350   0.4480   0.2635   0.4110
             0.3940   0.5560   1.3175   0.3860
             0.4240   0.4370   0.4680   0.4680

results in $(A+E)^{-1}_{44} = 15.3156$ so that subtraction of
$c_{44} = 14.896$ gives $d_{44} = 0.419$. This is considerably smaller than
the value, 1.401, obtained with the method of common bounds.

This method of recalculation may be considered impracticable by
some, and some alternative method may be preferred. However, it
should be mentioned here that this method of recalculation may
become preferable to other methods as improved techniques of calculation
are made available and as machinery is developed for computing
inverse matrices quickly.

Alternatively we may use the method of expansion in series. The
computation is somewhat more complicated here, however, for the size
of the bound of each element must be taken into consideration at each
step. Since the values of the errors are usually of the same order, we
find it convenient to use a measure of this order as a scalar multiplier.
Thus the matrix of errors of Table I, with the signs attached so as to
obtain the extreme positive value of $d_{44}$, can be written

                |  2    1    1   -2 |
(12.1)   E  =   |  2    2    1   -2 |  [0.0005]
                |  2    2    3   -2 |
                | -2   -2   -2    2 |

or

(12.2)    $E = H\eta,$

where $\eta$ is the unit of measurement.
The series then becomes

(12.3)    $D = \eta\, CHC + \eta^2\, CHCHC + \eta^3\, CHCHCHC + \cdots,$

and the expansion is carried out as in Table IV.

The method of adjusted matrices, as described above, is not
satisfactory in this case since in general the extreme error matrix cannot be
written as the product of two vectors.


XIII. MAXIMUM EFFECTS OF COMPUTATIONAL ERRORS 


Many methods of computation do not lead to exact values of
$A^{-1} = C$, or even to values which can be guaranteed to a fixed number of
decimal places, but rather to values which are admittedly approximations
to C and which, it is hoped, may be used in place of C. If the approximate
matrix is denoted by $C_0$, what discrepancies between $C_0$ and C are
permissible?

In a paper mentioned above, [26], one of the authors used the method
of submatrices in arriving at a five decimal place approximation to C.
This approximation, rounded to three decimal places, is

                    6.485     5.152     3.311   -13.625  |  28.573
(13.1)   C_0  =    -4.798     6.485    -0.841    -0.285  |  12.409
                    0.694    -1.627     1.084    -0.147  |   3.552
                   -2.086    -9.095    -3.299    14.896  |  29.376
                   14.063    22.359     8.535    28.953  |  73.910

(The sums of the absolute values of the row and column elements are
placed in the margins for later use.) Is $C_0$ close enough to C so that it
can be used in answering questions of the type discussed in the earlier
sections of this paper? Can a bound for the computational errors be
found? Can we find a matrix of computational errors which, when
added to $C_0$, gives C to three decimal places?

The question of the accumulation of rounding-off errors has received
extensive attention in the literature of the last decade. A complete
survey of this material is hardly appropriate to this article but a few of
the techniques should be mentioned. Hotelling [11] proposed that an
approximation $C_0$ might serve as the basis of an iterative method which
continues until the result is correct to the desired number of places.
Satterthwaite [20] has proposed a type of error control in which the
norm is reduced to a very small amount by matrix multiplication; von
Neumann and Goldstine [17] have studied the accumulation of
computational errors in what we call the method of single division [5, p. 99];
Turing [24] studied the accumulation of errors of several common
reduction methods using the maximum coefficient. Hotelling [10]
provided formulas using norms which give bounds to the difference
between C and $C_0$ and between C and $C_m$ where $C_m$ is the improved
approximation to $A^{-1}$ resulting from m iterations. Following the style of
Hotelling, but not his precise notation, we write the matrix of
computational errors as


(13.2)    $\Delta = C - C_0 = C(I - AC_0) = CG.$

Using norms we get

(13.3)    $N(\Delta) \le N(C)\,N(G).$

Unfortunately we do not know N(C) but we can develop an upper
bound for it if N(G) < 1. Since

(13.4)    $C = C_0[AC_0]^{-1} = C_0[I - G]^{-1} = C_0[I + G + G^2 + G^3 + \cdots]$

we have

(13.5)    $\Delta = C - C_0 = C_0[G + G^2 + G^3 + \cdots]$

so

(13.6)    $N(\Delta) \le N(C_0)\,[k + k^2 + k^3 + \cdots],$

(13.7)    $N(\Delta) \le \dfrac{N(C_0)\,k}{1 - k},$

where $k = N(G)$.

This formula is given by Hotelling [10, p. 281]. A similar formula using
M(A) is given by Turing [24]. In our notation his formula is

(13.8)    $M(\Delta) \le \dfrac{n\,M(C_0)\,M(G)}{1 - n\,M(G)}.$

We apply (13.7) and (13.8) to the problem above. We compute
$G = I - AC_0$ and get

                      | -1163     436     339    -108 |  2046
(13.9)   10^6 G  =    | -1114     441     427     -87 |  2069
                      | -1365     403     911    -447 |  3126
                      | -1163     418     528    -148 |  2257

with 
М№(Со) = 25.598, МьСь) = 29.876, M(Co) = 14.896 
(13.10) (Со) (Со) / (Со) | 
М№(@) = 0.002860, М.(6) = 0.003126, M(G) = 0.001365 
во 


(13.11) М.(А) S 0.073, (4) = 0.092, апа M(A) < 0.082, 


and it follows that no element of $C_0$ differs from the corresponding
element of C by more than 0.073. This is less than many of the extreme
errors attributable to inaccuracies of the original data (see Table VII)
when $\eta = 0.0005$ so there may be little point in making closer
approximation to $A^{-1}$ unless it is possible to state the original matrix
more accurately.

If the approximation, $C_0$, were not so good, i.e., if $N(\Delta)$ were
considerably larger, it might be wise to obtain a closer approximation to C.
Hotelling suggests

(13.12)    $C_1 = C_0[2I - AC_0] = C_0 + C_0G$

as the next approximation. Then

(13.13)    $\Delta_1 = C - C_1 = C - C_0 - C_0G = \Delta - C_0G$

and the difference between the approximations is $C_0G$.
This process can be extended. If it is extended through m iterations
we can arrive at Hotelling's formula

(13.14)    $N(\Delta_m) \le \dfrac{N(C_0)\,k^{2^m}}{1 - k}.$

Hotelling points out that the $2^m$ exponential of k means that the number
of sure decimal places is doubled with each iteration.

Sometimes we wish to provide the $\Delta$ which, when added to $C_0$, gives
the value of C to a specified number of decimal places. Though this
can be done by iterative methods, such as the method proposed by
Hotelling, it can also be accomplished with the use of (13.5) if $C_0$ is
close enough to C so that the series converges. The method is effective
when $C_0$ is a good approximation to C, as in this case one needs to
compute but a few terms of the series to obtain the correct values. This
situation usually exists when direct methods or build-up methods are
used in obtaining $C_0$.
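
The doubling of sure decimal places under the iteration (13.12) can be seen on a small case. The Python sketch below is an illustration under assumptions of our own: it uses the 3 by 3 Leontief matrix of Table III, a two-decimal approximate inverse as the starting value, and the root of the sum of squares as the norm N; the helper names are ours.

```python
import math


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def norm(M):
    """Root of the sum of squares of the elements (assumed meaning of N)."""
    return math.sqrt(sum(x * x for row in M for x in row))


A = [[1.00, -0.50, -0.30],
     [-0.40, 1.00, -0.30],
     [-0.20, -0.30, 1.00]]
# Two-decimal approximate inverse used as the starting value C_0.
C0 = [[1.56, 1.01, 0.77],
      [0.79, 1.61, 0.72],
      [0.55, 0.68, 1.37]]
I3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]


def residual(Cm):
    """G_m = I - A C_m."""
    AC = matmul(A, Cm)
    return [[I3[i][j] - AC[i][j] for j in range(3)] for i in range(3)]


def refine(Cm):
    """One step of (13.12): C_{m+1} = C_m (2I - A C_m)."""
    AC = matmul(A, Cm)
    factor = [[2.0 * I3[i][j] - AC[i][j] for j in range(3)] for i in range(3)]
    return matmul(Cm, factor)


norms = []
Cm = C0
for m in range(3):
    norms.append(norm(residual(Cm)))
    Cm = refine(Cm)

print(norms)
```

Each residual norm is no larger than the square of the one before it, since the residual after a step is the square of the previous residual matrix.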

In many cases, a single term of the series will be sufficient. This is
true in the illustration above since $C_0G^2$, $C_0G^3$, ··· are too small
to make any contribution to the third decimal place. It follows that

                            | -0.002    0.001    0.000   -0.001 |
(13.15)   Δ  =  C_0 G  =    |  0.000    0.000    0.000    0.000 |
                            |  0.000    0.000    0.000    0.000 |
                            |  0.000    0.000    0.000    0.000 |

Addition of $\Delta$ to $C_0$ gives the value of C correct to three decimals.
The formula for (13.5) can be used in another way. Following the
type of argument of Section 5 for inherent errors, we let
$g_1, g_2, \cdots, g_n$ be the maximum absolute elements in columns
1, 2, ···, n of G. Then

(13.16)    $B(G) = \begin{bmatrix} g_1 & g_2 & \cdots & g_n \\ \vdots & \vdots & & \vdots \\ g_1 & g_2 & \cdots & g_n \end{bmatrix}.$

Each row of $B(G^2)$ is $g_1\sum g,\ g_2\sum g,\ \cdots,\ g_n\sum g$. Each row
of $B(G^m)$ is
$g_1(\sum g)^{m-1},\ g_2(\sum g)^{m-1},\ \cdots,\ g_n(\sum g)^{m-1}$.

The matrix $\sum_{m=1}^{\infty} B(G^m)$ is composed of the rows

$\dfrac{g_1}{1-k},\ \dfrac{g_2}{1-k},\ \cdots,\ \dfrac{g_n}{1-k}, \qquad k = \sum g,$

so that

(13.17)    extreme $\Delta_{rs} = \sum_i |c_{0;ri}|\ \dfrac{g_s}{1 - \sum g}.$

Application of (13.17) using (13.1) and (13.9) yields the result

                   | 0.039    0.013    0.026    0.013 |
extreme Δ  =       | 0.017    0.005    0.011    0.006 |
                   | 0.005    0.002    0.003    0.002 |
                   | 0.040    0.013    0.027    0.013 |

which, while not as good as (13.15), is much better than the results of
(13.11).
A MULTIPLE GROUP LEAST SQUARES' PROBLEM AND
THE SIGNIFICANCE OF THE ASSOCIATED
ORTHOGONAL POLYNOMIALS


BRADFORD F. KIMBALL
New York State Department of Public Service


A multiple group least squares’ problem is presented and 
solved which involves determination of certain coefficients 
separately for each of several samples with a final coefficient 
given a single optimum value for all samples grouped to- 
gether. 

The computation of orthogonal polynomials implicit in the 
normal equations is discussed relative to the forward solution 
of these equations. Certain significant advantages of comput- 
ing orthogonal predictors are pointed out. 

A numerical example of the computation of the orthogonal 
polynomials in the case of a weighted fit is presented with ap- 
plication to determination of the confidence band of a fitted 
trend curve. 


I. A MULTIPLE GROUP LEAST SQUARES' PROBLEM


A cost-distance relationship is sought in a study of motor
carrier costs. Cost data from samples of several companies are
available but they are from different regions and apply to different
weight ranges of shipments. A quadratic polynomial distance function
is to be fitted by least squares, where the distance variable x is taken
as the square root of the distance, and the dependent cost variable is
cost per 100 lbs.

There is reason to take the coefficient of the squared term as constant
over all samples, while the linear coefficients are to be fitted to
the separate samples. How does one solve the problem?

The answer is furnished by the following analysis: In the interests
of simplicity limit the problem to three samples. One then seeks three
cost-distance functions of the type

(1.1)  $u^{(1)} = \alpha^{(1)} + \beta^{(1)} x + \gamma x^2$
       $u^{(2)} = \alpha^{(2)} + \beta^{(2)} x + \gamma x^2$
       $u^{(3)} = \alpha^{(3)} + \beta^{(3)} x + \gamma x^2$

which give a best fit to the data in the least squares' sense.
The costs occur as three series $y_j^{(s)}$, s = 1, 2, 3 with $N^{(s)}$ values in the
sth series. The moments are designated by

(1.2) $a_{hk}^{(s)} = S[x_h^{(s)} x_k^{(s)}]$, $a_{h0}^{(s)} = S[x_h^{(s)} y^{(s)}]$

where

$x_1^{(s)} = 1,\quad x_2^{(s)} = x^{(s)},\quad x_3^{(s)} = (x^{(s)})^2$.

Thus the matrix of the normal equations for each sample considered
separately will be

(1.3) $\| a_{i1}^{(s)} \;\; a_{i2}^{(s)} \;\; a_{i3}^{(s)}, \; a_{i0}^{(s)} \|$, i = 1, 2, 3.


For each sample we first neglect the squared term in x and obtain
solutions

(1.4) $U^{(s)} = B_1^{(s)} + B_2^{(s)} x$

of the normal system

(1.5) $\| a_{i1}^{(s)} \;\; a_{i2}^{(s)}, \; a_{i0}^{(s)} \|$, i = 1, 2.

As the next step determine a pair of solutions $A_1^{(s)}$, $A_2^{(s)}$ of the system

(1.6) $\| a_{i1}^{(s)} \;\; a_{i2}^{(s)}, \; a_{i3}^{(s)} \|$, i = 1, 2

where the moments of the dependent variable on the right have been
replaced by the moments involving $x_3$.

It will be found that the first two equations of the system (1.3) are
satisfied by the three coefficients

$B_1^{(s)} - tA_1^{(s)},\quad B_2^{(s)} - tA_2^{(s)},\quad t$

for any value of t. Thus the two following equations

(1.7)  $a_{11}^{(s)}(B_1^{(s)} - tA_1^{(s)}) + a_{12}^{(s)}(B_2^{(s)} - tA_2^{(s)}) + a_{13}^{(s)} t = a_{10}^{(s)}$
       $a_{21}^{(s)}(B_1^{(s)} - tA_1^{(s)}) + a_{22}^{(s)}(B_2^{(s)} - tA_2^{(s)}) + a_{23}^{(s)} t = a_{20}^{(s)}$

will be satisfied for each sample, for any value of t. Then the quadratic
curve, for any constant value of t,

$u^{(s)}(t) = B_1^{(s)} - tA_1^{(s)} + (B_2^{(s)} - tA_2^{(s)})x + t x^2$

will have linear terms which will be the optimum terms in the least
squares' sense for each t for the sample in question. This means that

(1.8) $\partial S^{(s)}/\partial (B_1^{(s)} - tA_1^{(s)}) = \partial S^{(s)}/\partial (B_2^{(s)} - tA_2^{(s)}) = 0$

where $S^{(s)}$ denotes the sum of the squares of the residuals for the sth
sample.


Thus our problem will be solved if t can be chosen to give an optimum
value for all the samples. This optimum is taken as the value

$t = \gamma$

which will minimize the sum of the squares of the residuals over all the
samples lumped together.

Now for a single sample the total derivative of the sum of the
squares of the residuals with respect to t can be shown to be

(1.9) $dS^{(s)}/dt = -2\,(Y_3^{(s)} - t\,q_3^{(s)})$

where

(1.10) $q_3^{(s)} = a_{33}^{(s)} - A_1^{(s)} a_{31}^{(s)} - A_2^{(s)} a_{32}^{(s)}$
       $Y_3^{(s)} = a_{30}^{(s)} - B_1^{(s)} a_{31}^{(s)} - B_2^{(s)} a_{32}^{(s)}$

Using (1.8) this is proved as follows: Omitting the superscripts, write

$u(t) = P(t) + xQ(t) + tx^2$,
$P(t) = B_1 - tA_1,\quad Q(t) = B_2 - tA_2$,

and the total derivative of S as to t is given by

$\dfrac{dS}{dt} = \dfrac{\partial S}{\partial P}\dfrac{dP}{dt} + \dfrac{\partial S}{\partial Q}\dfrac{dQ}{dt} + \dfrac{\partial S}{\partial t}$

where the last partial derivative involves t only as coefficient of $x^2$ in
u(t). From (1.8)

$\partial S/\partial P = \partial S/\partial Q = 0$.

Hence

$dS/dt = \partial S/\partial t = 2\,S(u - y)x^2 = 2\,(S\,x^2 u - S\,x^2 y)$

which easily reduces to (1.9).
Now note that (cf. (1.7))

(1.11) $-(Y_3^{(s)} - q_3^{(s)} t) = a_{31}^{(s)}(B_1^{(s)} - tA_1^{(s)}) + a_{32}^{(s)}(B_2^{(s)} - tA_2^{(s)}) + (a_{33}^{(s)} t - a_{30}^{(s)})$.

Using (1.9) the derivative of the sum of the squares of the residuals
over all samples is given by summing the expression on the right
over all samples with t constant. Setting this equal to zero

(1.12) $\gamma = \dfrac{Y_3^{(1)} + Y_3^{(2)} + Y_3^{(3)}}{q_3^{(1)} + q_3^{(2)} + q_3^{(3)}}$.

Hence the required coefficients of the quadratics (1.1) are given by

(1.13) $\alpha^{(s)} = B_1^{(s)} - \gamma A_1^{(s)},\quad \beta^{(s)} = B_2^{(s)} - \gamma A_2^{(s)}$,

where $\gamma$ is determined by (1.12).
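The steps (1.5) to (1.13) can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's computing schedule: the three small samples are made-up numbers, the helper names (`moments`, `solve2`, `multi_group_fit`) are hypothetical, and exact rational arithmetic is assumed so that the normal equations can be checked without rounding.

```python
from fractions import Fraction as F

def moments(xs, ys):
    """a[h][k] = S[x_h x_k] and a0[h] = S[x_h y], with x_1 = 1, x_2 = x, x_3 = x^2."""
    cols = [[F(1) for _ in xs], [F(x) for x in xs], [F(x) ** 2 for x in xs]]
    a = [[sum(p * q for p, q in zip(cols[h], cols[k])) for k in range(3)]
         for h in range(3)]
    a0 = [sum(p * F(y) for p, y in zip(cols[h], ys)) for h in range(3)]
    return a, a0

def solve2(a, rhs):
    """Solve the first two normal equations in two unknowns (Cramer's rule)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    u = (rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det
    v = (a[0][0] * rhs[1] - a[1][0] * rhs[0]) / det
    return u, v

def multi_group_fit(samples):
    parts = []
    for xs, ys in samples:
        a, a0 = moments(xs, ys)
        B1, B2 = solve2(a, a0[:2])                    # system (1.5)
        A1, A2 = solve2(a, [a[0][2], a[1][2]])        # system (1.6)
        q3 = a[2][2] - A1 * a[2][0] - A2 * a[2][1]    # (1.10)
        Y3 = a0[2] - B1 * a[2][0] - B2 * a[2][1]      # (1.10)
        parts.append((B1, B2, A1, A2, q3, Y3))
    gamma = sum(p[5] for p in parts) / sum(p[4] for p in parts)   # (1.12)
    coeffs = [(B1 - gamma * A1, B2 - gamma * A2)                  # (1.13)
              for B1, B2, A1, A2, _, _ in parts]
    return gamma, coeffs

# Hypothetical data: three samples (x values, y values).
samples = [
    ([1, 2, 3, 4], [3, 5, 9, 16]),
    ([2, 3, 5, 6, 7], [4, 6, 15, 20, 28]),
    ([1, 3, 4, 6], [2, 7, 11, 24]),
]
gamma, coeffs = multi_group_fit(samples)
```

With these values, the residuals of each fitted quadratic are orthogonal to 1 and x within their own sample, while orthogonality to $x^2$ holds only for the three samples taken together, which is exactly the grouping the problem calls for.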
II. ANALOGY WITH METHOD OF ORTHOGONAL POLYNOMIALS


Light is thrown on the problem of finding the sampling variance of
$\gamma$ by bringing out the orthogonal relations inherent in the method. The
method of "orthogonal polynomials" would proceed by first determining
orthogonal functions $\phi_i$ of the variables $x_i$ (see (1.2) above). The
requirement is that

(2.1) $S\phi_1\phi_2 = S\phi_1\phi_3 = S\phi_2\phi_3 = 0$

where the summations are taken over the N values of the independent
variables $x_{ij}$ which correspond to the data series on $y_j$ in a particular
sample. The author has shown [6, Sec. 3] that a set of such orthogonal
functions may be taken as

(2.2) $\phi_1 = x_1,\quad \phi_2 = x_2 - Mx_1,\quad \phi_3 = x_3 - A_1x_1 - A_2x_2$

where

(2.3) $M = a_{12}/a_{11}$

and $A_1$ and $A_2$ are defined by (1.6) above, for a single sample.
The solution is sought in the form

(2.4) $u = t_1\phi_1 + t_2\phi_2 + t_3\phi_3$

and the normal equations for a single sample reduce to

(2.5) $t_1 S\phi_1^2 = S\phi_1 y,\quad t_2 S\phi_2^2 = S\phi_2 y,\quad t_3 S\phi_3^2 = S\phi_3 y$.

An examination of the summation products $S\phi_3^2$ and $S\phi_3 y$ shows that
for a single sample

(2.6) $S\phi_3^2 = q_3,\quad S\phi_3 y = Y_3$

where $q_3$ and $Y_3$ are given by (1.10). This is proved in the general case
in [6], formulas (4.5), (4.6) and discussion of (3.2).


Hence if one denotes by $\phi_3^{(s)}$ the orthogonal functions applicable to
the sth sample, one can write

(2.7) $\gamma = \dfrac{S\phi_3^{(1)}y^{(1)} + S\phi_3^{(2)}y^{(2)} + S\phi_3^{(3)}y^{(3)}}{q_3^{(1)} + q_3^{(2)} + q_3^{(3)}}$

where the denominator involves only the independent variables $x_i$.
Thus one can proceed to find the sampling variance of $\gamma$ as follows:
The first order variation in $\gamma$ due to variations in $y_j$ is given by

(2.8) $\delta\gamma = [S\phi_3^{(1)}\delta y^{(1)} + S\phi_3^{(2)}\delta y^{(2)} + S\phi_3^{(3)}\delta y^{(3)}]/Q$

where for brevity Q denotes the denominator of (2.7). Making the usual
assumption that the $y_j$ are independent with constant variance,

$E[(\delta\gamma)^2] = E[(\delta y)^2]\,[S(\phi_3^{(1)})^2 + S(\phi_3^{(2)})^2 + S(\phi_3^{(3)})^2]/Q^2$

which by (2.6) reduces to

$E[(\delta\gamma)^2] = E[(\delta y)^2]/Q$.

Hence the estimated sampling variance of $\gamma$ is given by

(2.9) Est. Variance of $\gamma = V/(q_3^{(1)} + q_3^{(2)} + q_3^{(3)})$

where V equals the sum of the squares of the residuals divided by
$(N^{(1)} + N^{(2)} + N^{(3)} - 7)$. This last denominator represents the proper
number of degrees of freedom since it is the total number of elements
in the three samples (see (1.1) above) decreased by the total number
of coefficients determined in the fitting process.
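As a rough illustration of (2.6) to (2.9), the following Python sketch evaluates $\phi_3$ explicitly for each sample and forms the estimated variance of $\gamma$. The data and the function names (`solve2`, `third_orthogonal`, `gamma_and_variance`) are hypothetical, and exact fractions are an assumption made only so that the identities can be checked exactly.

```python
from fractions import Fraction as F

def solve2(a11, a12, a22, r1, r2):
    """Solve a11*u + a12*v = r1, a12*u + a22*v = r2."""
    det = a11 * a22 - a12 * a12
    return (r1 * a22 - a12 * r2) / det, (a11 * r2 - a12 * r1) / det

def third_orthogonal(xs, ys):
    """Values of phi_3, q_3 = S phi_3^2 and Y_3 = S phi_3 y for one sample."""
    xs = [F(x) for x in xs]
    n, s1 = F(len(xs)), sum(xs)
    s2, s3 = sum(x ** 2 for x in xs), sum(x ** 3 for x in xs)
    A1, A2 = solve2(n, s1, s2, s2, s3)             # system (1.6)
    phi3 = [x * x - A1 - A2 * x for x in xs]       # (2.2) with x_1 = 1, x_2 = x
    q3 = sum(p * p for p in phi3)                  # (2.6)
    Y3 = sum(p * y for p, y in zip(phi3, ys))      # (2.6)
    return phi3, q3, Y3

def gamma_and_variance(samples):
    parts = [third_orthogonal(xs, ys) for xs, ys in samples]
    Q = sum(q for _, q, _ in parts)                # denominator of (2.7)
    gamma = sum(Y for _, _, Y in parts) / Q        # (2.7)
    rss = F(0)
    for (xs, ys), (phi3, _, _) in zip(samples, parts):
        n, s1, s2 = F(len(xs)), F(sum(xs)), F(sum(x * x for x in xs))
        sy, sxy = F(sum(ys)), F(sum(x * y for x, y in zip(xs, ys)))
        B1, B2 = solve2(n, s1, s2, sy, sxy)        # linear fit (1.4)
        # fitted quadratic = B1 + B2*x + gamma*phi_3
        rss += sum((y - B1 - B2 * x - gamma * p) ** 2
                   for x, y, p in zip(xs, ys, phi3))
    dof = sum(len(xs) for xs, _ in samples) - 7    # three samples, seven coefficients
    return gamma, (rss / dof) / Q, Q               # (2.9)

# Hypothetical data: three samples (x values, y values).
samples = [
    ([1, 2, 3, 4], [3, 5, 9, 16]),
    ([2, 3, 5, 6, 7], [4, 6, 15, 20, 28]),
    ([1, 3, 4, 6], [2, 7, 11, 24]),
]
gamma, var_gamma, Q = gamma_and_variance(samples)
```

The degrees-of-freedom count of 7 is specific to three samples with two separate coefficients each plus one common coefficient; a different grouping would need a different count.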

The above method can be applied equally well if different statistical
weights are assigned to the observations on the dependent variable. The
only difference in procedure would be that the moments would be
weighted moments, and that one would deal with a weighted sum of the
squares of the residuals. The formulas (2.1), (2.5) and (2.6) would involve
weighted sums of the orthogonal functions, but the determination
of the explicit form of the orthogonal polynomials (2.2) would remain
unchanged since the weighting process would be included in the primary
moments which determine M, $A_1$ and $A_2$.

Problems of the type discussed in Sections I and II might very well
occur in many other fields of application of least squares' theory. The
essential characteristics of the problem are: a determination of a set
of coefficients of fitted functions so that they separately give optimum
fit to each particular sample; a determination of another coefficient so
that it gives optimum fit to the group of samples lumped together.

The procedure can be further generalized to apply to two or more
sets of coefficients, where there is more than one coefficient in the
second set, and to cases where the grouping of samples is performed
more than once. This generalization involves some changes in procedure
and may be treated by the author in a later paper.


III. SOLUTION BY MEANS OF THE DOOLITTLE ALGORITHM

The Doolittle algorithm, supplemented by columns used for computing
the elements of the reciprocal matrix, furnishes an efficient means
of computing the necessary quantities in the preceding problem.
We use a slight modification of Dwyer's notation and first write the
moment matrix supplemented by the array for computing elements of
the reciprocal matrix, for three variables. This is termed the reference
matrix in Schedule I below. This is followed by the computation algorithm
where for brevity Dwyer's "secondary" subscripts have been
omitted from the b's and the g's, and a zero subscript has been used to
denote the dependent variable [2, pp. 99-118 and 170].


SCHEDULE I
Reference Matrix
Row         y       x_1     x_2     x_3     x_1     x_2     x_3
I           a_10    a_11    a_12    a_13    1       0       0
II          a_20            a_22    a_23    0       1       0
III         a_30                    a_33    0       0       1
Computation Algorithm
S(I) = I    a_10    a_11    a_12    a_13    1       0       0
I'          b_10    1       b_12    b_13    -       -       -
S(II)       g_20            g_22    g_23    -M      1       0
II'         b_20            1       b_23    -       -       -
S(III)      g_30                    g_33    -A_1    -A_2    1
III'        b_30                    1       -       -       -


For the purpose of determining the solution in terms of orthogonal
polynomials it is not necessary to carry the primed rows through the
auxiliary part of the computational schedule. The orthogonal polynomials
are given explicitly by the elements in the summation rows
S(I), S(II) and S(III). Thus

(3.1) $\phi_1 = x_1,\quad \phi_2 = x_2 - Mx_1,\quad \phi_3 = x_3 - A_1x_1 - A_2x_2$.

The multipliers which apply to these orthogonal functions are the b's
of the first column. Thus the fitted function is

(3.2) $u = b_{10}\phi_1 + b_{20}\phi_2 + b_{30}\phi_3$

and for

(3.3) $u = h_1x_1 + h_2x_2 + h_3x_3$

the forward solution for the three variable fit is

(3.4)  $h_1^{(3)} = b_{10} - Mb_{20} - A_1b_{30}$
       $h_2^{(3)} = b_{20} - A_2b_{30}$
       $h_3^{(3)} = b_{30}$.

For two variables omit the third column.

It has been shown [6, formula (4.5) and discussion of (3.2)] that the g
elements in the summation rows (not including the auxiliary schedule)
are related to the orthogonal functions as follows:

(3.5) $g_{i0} = Sy\phi_i\ (i = 1, 2, \cdots),\quad g_{ii} = S\phi_i^2$

where summations are taken over the range of observations fitted.
Thus from (2.6)

(3.6) $q_3 = S\phi_3^2 = g_{33};\quad Y_3 = Sy\phi_3 = g_{30}$.


Hence the solution sought, in terms of the notation of the Doolittle
algorithm, is

(3.7) $\gamma = \dfrac{g_{30}^{(1)} + g_{30}^{(2)} + g_{30}^{(3)}}{g_{33}^{(1)} + g_{33}^{(2)} + g_{33}^{(3)}}$

where the g's are determined from separate Doolittle algorithms carried
through for each sample, and

(3.8)  $\alpha^{(s)} = b_{10}^{(s)} - M^{(s)}b_{20}^{(s)} - \gamma A_1^{(s)}$
       $\beta^{(s)} = b_{20}^{(s)} - \gamma A_2^{(s)}$,  s = 1, 2, 3

give the first two coefficients for each sample, adjusted for $\gamma$.

IV. ADVANTAGES OF COMPUTING THE ORTHOGONAL FUNCTIONS 


We have seen in the last section that the explicit determination of a
set of orthogonal functions inherent in the least squares' solution is
possible by supplementing the array of moments used in the reference
matrix of the Doolittle algorithm by the identity matrix (with diagonal
elements equal to unity and all other elements equal to zero). This process
is entirely general and will serve to determine a set of orthogonal
functions for any number of independent variables or "predictors."

One may well ask why it is not desirable to complete the computation
for the elements of the reciprocal matrix. In some cases this would
probably be desirable, for example in a problem of linear correlation
where the number of predictors is fixed [4]. However, in many cases
one is experimenting with the question of what predictors are suitable;
and hence a process by which new variables can be introduced or
dropped with a minimum of computation is highly desirable. As we
shall presently see, the computation of the coefficients of the orthogonal
functions offers an efficient means for such a process of iteration or
"build up."
We introduce here a general notation. The general problem is taken
as that of fitting a linear function u of N observations on a set of n
variables $x_i$ (see (1.2) above) and the matrix of the normal equations is
given by (1.3) with no superscripts since we deal only with one sample.
The fitted function is written

(4.1) $u_n = \beta_1^{(n)}x_1 + \beta_2^{(n)}x_2 + \beta_3^{(n)}x_3 + \cdots + \beta_n^{(n)}x_n$

where the superscript here denotes the number of predictors used in the
fitting process. Thus if only three are used,

$u_3 = \beta_1^{(3)}x_1 + \beta_2^{(3)}x_2 + \beta_3^{(3)}x_3$.

The same function expressed in terms of the orthogonal functions is
written as

(4.2) $u_n = t_1\phi_1 + t_2\phi_2 + t_3\phi_3 + \cdots + t_n\phi_n$

and in this case the solution for three variables is given by using the
first three terms of the same series.

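Denote by $\alpha_i^{(k)}$ the coefficients which determine the orthogonal
functions. Thus

(4.3)  $\phi_1 = x_1$
       $\phi_2 = x_2 - \alpha_1^{(1)}x_1$
       $\phi_3 = x_3 - \alpha_1^{(2)}x_1 - \alpha_2^{(2)}x_2$
       ...
       $\phi_n = x_n - \alpha_1^{(n-1)}x_1 - \alpha_2^{(n-1)}x_2 - \cdots - \alpha_{n-1}^{(n-1)}x_{n-1}$.

From the above it follows that

(4.4)  $\beta_1^{(n)} = t_1 - t_2\alpha_1^{(1)} - t_3\alpha_1^{(2)} - \cdots - t_n\alpha_1^{(n-1)}$
       $\beta_2^{(n)} = t_2 - t_3\alpha_2^{(2)} - \cdots - t_n\alpha_2^{(n-1)}$
       ...
       $\beta_{n-1}^{(n)} = t_{n-1} - t_n\alpha_{n-1}^{(n-1)}$
       $\beta_n^{(n)} = t_n$

where the multipliers $t_i$ are the same as the b elements taken from the
first column of Schedule I. Using the abbreviated Dwyer notation,

(4.5) $t_1 = b_{10},\ t_2 = b_{20},\ \cdots,\ t_n = b_{n0}$.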
We have seen in the last section that the coefficients of the orthogonal
functions for the first three variables are given by the elements in
the "summation" rows of the auxiliary schedule of the Doolittle algorithm.
This is known to be true in the general case [6, Sec. 3]. Hence in
terms of our general notation the elements in the summation rows of
this auxiliary schedule appear as

SCHEDULE II
Auxiliary to Doolittle Algorithm

(4.6)
Row       x_1                  x_2                  x_3                  ...   x_n
S(I)      1                    0                    0                    ...   0
S(II)     $-\alpha_1^{(1)}$    1                    0                    ...   0
S(III)    $-\alpha_1^{(2)}$    $-\alpha_2^{(2)}$    1                    ...   0
...
S(n)      $-\alpha_1^{(n-1)}$  $-\alpha_2^{(n-1)}$  $-\alpha_3^{(n-1)}$  ...   1

The relationships (4.4) to (4.6) offer precisely the "forward solution"
suggested by Dwyer [2, p. 170]. In other words, if one uses only two
columns in the array (4.4) the solution of the two variable problem is
obtained. Addition of the third column gives the three variable solution,
and so forth.

If one uses the fact that the t's are the multipliers of the orthogonal
functions in the solution (4.2), one realizes that the t's are statistically
independent in the sense of having zero covariance over the range used
in fitting. Assuming no errors in the predictors, the sampling variance
weight of $t_i$ is known to be the reciprocal of the antecedent $g_{ii}$ (see
Schedule I). Hence the variance weight of any coefficient $\beta_k^{(n)}$ of the
n variable problem is given by

(4.7) Variance Weight of $\beta_k^{(n)} = 1/g_{kk} + (\alpha_k^{(k)})^2/g_{k+1,k+1} + \cdots + (\alpha_k^{(n-1)})^2/g_{nn}$.

The advantage of this formula is that to find the variance weight of
the coefficient $\beta_k^{(n+1)}$ of the same predictor, when another prediction
variable is added, one merely adds the term

$(\alpha_k^{(n)})^2/g_{n+1,n+1}$.

A similar formula can be set up by the researcher for the covariance of
two coefficients.
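Formula (4.7) can be checked against the reciprocal matrix on a small case. The sketch below is illustrative only: the predictors (powers of x on a hypothetical grid), and the names `eliminate` and `variance_weight`, are assumptions, and the coefficient `c[m][k-1]` stands for $-\alpha_k^{(m)}$ read from the summation rows of the auxiliary schedule.

```python
from fractions import Fraction as F

def eliminate(a):
    """Forward elimination on [a | I]; returns g[k] = g_kk and the identity
    part c[k] of each summation row."""
    n = len(a)
    rows = [[F(v) for v in a[i]] + [F(int(i == j)) for j in range(n)]
            for i in range(n)]
    for k in range(n):
        for i in range(k + 1, n):
            m = rows[i][k] / rows[k][k]
            rows[i] = [u - m * v for u, v in zip(rows[i], rows[k])]
    return [rows[k][k] for k in range(n)], [rows[k][n:] for k in range(n)]

def variance_weight(k, n, g, c):
    """(4.7): variance weight of beta_k in the n-predictor fit (k, n are 1-based)."""
    return sum(c[m][k - 1] ** 2 / g[m] for m in range(k - 1, n))

# Hypothetical predictors 1, x, x^2, x^3 on a small grid.
xs = [0, 1, 2, 3, 4, 5]
preds = [[1, x, x * x, x ** 3] for x in xs]
a = [[sum(p[h] * p[k] for p in preds) for k in range(4)] for h in range(4)]
g, c = eliminate(a)

# Adding the fourth predictor changes the weight of beta_2 by a single term.
increment = variance_weight(2, 4, g, c) - variance_weight(2, 3, g, c)
```

The point of the design is that the rows of `c` and the values `g` computed for the first n predictors are left untouched when predictor n + 1 is introduced, so earlier work is reused.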

As more predictors are added, the variances of all coefficients $\beta_k^{(n)}$
are tabulated and partial correlations can easily be found from the
following relation. Denote by $S^{(n)}$ the sum of the squares of the residuals
after fitting a linear function of n predictors. Let $S_k^{(n-1)}$ denote the sum
of the squares of the residuals when the kth predictor $x_k$ is omitted in
the fitting. The following equation then holds [1, p. 163].

(4.8) $S_k^{(n-1)} = S^{(n)} + (\beta_k^{(n)})^2$ divided by Variance Wt. of $\beta_k^{(n)}$

From this the partial correlation of y relative to $x_k$ for the n variable
solution can be computed from the relation [8, p. 180]

(4.9) $1 - r^2_{yk\cdot\ldots} = [S^{(n)}/S_k^{(n-1)}]\,(N - n - 1)/(N - n)$.


The above analysis does not explicitly involve the orthogonal func- 
tions and thus does not involve new theory. The results are obtained 
simply from a study of the relations (4.4) above and constitute an 
argument for the computation of Schedule II. 

An explicit result following from the fact that the computations summarized
in Schedule II give the orthogonal functions is that in the case
of the linear problem the orthogonal functions furnish a new set of
predictors which are linear transforms of the original variables, and
possess zero covariance over the range of observed values used in the
fitting process. In this connection it should of course be recognized that
these orthogonal functions depend upon the order in which the original
variables have been considered. For example, if $x_k$ were taken as the
first variable in the fitting process, we should have $\phi_1 = x_k$. Hence any
change in the order in which the predictors are considered means a
change in the corresponding set of orthogonal functions computed by
the above method.

Knowledge of the explicit nature of these “orthogonal predictors” 
may throw light upon the nature of the original variables used. Indeed, 
in researches in psychology and economics it has been found that the 
determination of such “principal components” offers a starting point 
for simplifying inherent structural relationships. 

Analysis by means of orthogonal functions and their relation to the
Doolittle algorithm makes it much easier to generalize the type of problem
discussed in Sections I-III and offers a solution which can be carried
out in a systematic manner.

Perhaps the most obvious application is to problems of fitting of
polynomial curves which are to be used for purposes of forecasting. In
such cases it is often desirable to estimate the variance of the fitted
polynomial at several points on the curve for the purpose of determining
a sampling confidence band.

For curves of any complexity, say of the third or fourth degree, this is
simpler to do in terms of the orthogonal functions implicitly used in the
fitting process. Recalling that the reciprocals of the $g_{ii}$ elements in
Schedule I constitute the variance weights of the multipliers $t_i$ of the
orthogonal functions, which are now orthogonal polynomials of a single
variable, say x, the estimated variance of the fitted polynomial u at any
point x is approximated by (cf. [5], formula (21))

(4.10) $s_u^2 = V_n[\phi_1^2/g_{11} + \phi_2^2/g_{22} + \cdots + \phi_n^2/g_{nn}]$

where

$V_n = S^{(n)}/(N - n)$

applicable to a polynomial of (n - 1)st degree.
With the orthogonal polynomials determined by Schedule II, where
now

$x_1 = 1,\ x_2 = x,\ x_3 = x^2,\ \cdots,\ x_n = x^{n-1}$

their values at a series of points can be systematically computed with
considerable ease, and the resulting numerical values are then squared.
There remains merely to divide each one of these n values by the corresponding
$g_{ii}$ taken directly from the Doolittle algorithm. The sum
gives the variance weight $s_u^2/V_n$. If the degree of the polynomial is
increased by one, the new variance weight involves simply the addition
of the term

$\phi_{n+1}^2/g_{n+1,n+1}$

to what has already been computed.
The other alternative is to find the covariances and variances of the
coefficients $\beta_k^{(n)}$ by completing the computation of the reciprocal matrix
and apply them to squares and cross products of the original variables
at each point. This is not only a much more complicated process,
but also it will have to be repeated from the beginning, if a polynomial
of another degree be fitted.


V. NUMERICAL EXAMPLE 


As a numerical example we have used a problem presented by Snedecor
to illustrate the method of orthogonal polynomials [8]. Data are
taken over 11 consecutive days on weight of chick embryos and a succession
of polynomial curves to the fourth degree are fitted. We shall
use the same data with the exception that we shall suppose that inadvertently
no weight was recorded on one of the days (taken as the
11th day which happens to be at the central position). We shall use a
weighted fit with statistical weights equal to unity assigned to days
when weights were recorded, and a statistical weight of zero assigned to
the 11th day when no weight was recorded.

Data to be Fitted
Day:   6    7    8    9   10   11   12   13    14    15    16
y:    29   52   79  125  181    -  425  738  1130  1882  2812
x:    10    9    8    7    6    5    4    3     2     1     0

This procedure enables one to use "factorial moments" in setting up
the matrix of the normal equations. We assign values of x such that
x = 0 on the 16th day, at the extreme where y is large, and thus x = 10
corresponds to the 6th day which is the first day that an observation
was recorded. Thus there is no value of y at x = 5, and statistical weight
w = 0 at x = 5.

We use Lipps factorial moments [7] which might be termed "reduced"
factorial moments or "reduced cumulatives." They represent
successive cumulations from x = 10 towards x = 0 where each successive
cumulation omits the last subtotal in the previous cumulation. The
moments of the weighted independent variable x are

$S_0 = \sum w = 10,\quad S_1 = \sum w\binom{x}{1} = 50,\quad S_2 = \sum w\binom{x}{2} = 155$, etc.

In the case of the dependent variable the moments are

$M_0 = \sum wy = 7.453,\quad M_1 = \sum wy\binom{x}{1} = 11.407,\quad M_2 = \sum wy\binom{x}{2} = 16.623$, etc.


Since the major interest is in the plotting of the curves, the matrix of
the normal equations is set up to determine successive differences of the
polynomials at x = 0. We fit

$u_5 = \beta_1^{(5)} + \beta_2^{(5)}\binom{x}{1} + \beta_3^{(5)}\binom{x}{2} + \beta_4^{(5)}\binom{x}{3} + \beta_5^{(5)}\binom{x}{4}$

with

$x_1 = 1,\ x_2 = \binom{x}{1},\ x_3 = \binom{x}{2},\ x_4 = \binom{x}{3},\ x_5 = \binom{x}{4}$.

The moments involving the dependent variable in the matrix are then

$a_{k0} = \sum w\,y\binom{x}{k-1} = M_{k-1}$,  k = 1, 2, ..., 5.

The moments of the independent variable in the matrix are

$a_{hk} = \sum w\binom{x}{h-1}\binom{x}{k-1}$,  h, k = 1, 2, ..., 5.


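Using the formula

$\binom{x}{r}\binom{x}{s} = \sum_{j=0}^{r}\binom{r}{j}\binom{s+j}{r}\binom{x}{s+j}$

with $r \le s$, the matrix elements $a_{hk}$ are easily resolved in terms of the
reduced factorial moments $S_i$. For example

$a_{1k} = a_{k1} = S_{k-1},\quad a_{45} = 4S_4 + 30S_5 + 60S_6 + 35S_7$, etc.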


Substitution of the values of the reduced factorial moments S; yields 
the following matrix for the normal equations: 


y          x_1    x_2    x_3     x_4     x_5
 7.453     10     50     155     320     457
11.407      -    360    1270    2788    4133
16.623      -      -    4817   11054   16912
22.705      -      -       -   26234   41233
25.687      -      -       -       -   66327

Following the computational procedure outlined in Schedule I, with
the identity matrix placed at the right of the normal equation matrix, we
record in Schedule III the numerical values found in the "summation"
and "primed" rows of the computational algorithm.

The formula for the sum of the squares of the residuals, to be used in
the forward solution is (cf. Schedule I and [1, p. 164])

$S^{(5)} = \sum wy^2 - a_{10}b_{10} - g_{20}b_{20} - \cdots - g_{50}b_{50}$
$= 13.509609 - 13.506125 = .003484$

and the variance of fit is

$V_5 = .003484/5 = .000697$.
The forward solution for finding the $\beta$'s from the relations (4.4) and
(4.5) would be facilitated by a separate computational schedule which
we do not show here. This would be obtained from Schedule III by
multiplying the elements $b_{k0}$ in the first column corresponding to the
primed rows by the figures which appear in the preceding "summation"
rows obtained from the identity matrix (Cols. 8 to 12). If a long computational
sheet is used, this might be placed to the right of Schedule III
to avoid copying the multipliers $b_{k0}$.

One can summarize the numerical results as follows: The values of
$\beta_k^{(5)}$ give the differences at x = 0. They are

$u_5(0) = 2.822632,\quad \Delta u_5 = -.975823,\quad \Delta^2 u_5 = .2979619$,
$\Delta^3 u_5 = -.0702948,\quad \Delta^4 u_5 = .00947051$.

The orthogonal polynomials implicit in the solution, defined by the elements
in the summation rows of the auxiliary part of Schedule III, are

$\phi_1 = 1,\quad \phi_2 = \binom{x}{1} - 5,\quad \phi_3 = \binom{x}{2} - (4.5)x + 7$,

$\phi_4 = \binom{x}{3} - 4\binom{x}{2} + (7.2)x - 6$,

$\phi_5 = \binom{x}{4} - (3.5)\binom{x}{3} + (5.911765)\binom{x}{2} - (5.60294)x + 2.68235$

which means that the quartic can be expressed as (see (4.5) above)

$u_5 = t_1\phi_1 + t_2\phi_2 + t_3\phi_3 + t_4\phi_4 + t_5\phi_5$.

Subtotals obtained from the forward solutions of the $\beta$'s give the
differences of the intermediate polynomial curves fitted to the data.
These are

Straight Line: $u_2(0) = 1.920665,\ \Delta u_2 = -.235073$,

Quadratic: $u_3(0) = 2.574341,\ \Delta u_3 = -.655294,\ \Delta^2 u_3 = .0933824$,

Cubic: $u_4(0) = 2.797229,\ \Delta u_4 = -.922760,\ \Delta^2 u_4 = .2419744,\ \Delta^3 u_4 = -.03714802$.


Using (4.10) and the above computations the first approximation to
the variance of the fitted quartic at any point x is given by

$s_u^2 = (.000697)\{0.1 + [\phi_2(x)]^2/110 + [\phi_3(x)]^2/187 + [\phi_4(x)]^2/(171.6) + [\phi_5(x)]^2/(60.1446)\}$.
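The divisors 10, 110, 187, 171.6 and the fitted differences quoted above can be reproduced by running the forward elimination on the example in exact arithmetic. This is a sketch under stated assumptions: the tabulated y values are read as thousandths (so that $\sum wy = 7.453$), the helper names `differences` and `variance_weight_at` are hypothetical, and the last divisor comes out near 60.144 rather than exactly the 60.1446 printed, presumably because of rounding in the original desk computation.

```python
from fractions import Fraction as F
from math import comb

x = [10, 9, 8, 7, 6, 4, 3, 2, 1, 0]      # x = 5 has weight zero
y = [F(v, 1000) for v in (29, 52, 79, 125, 181, 425, 738, 1130, 1882, 2812)]
n = 5
basis = [[F(comb(xi, r)) for r in range(n)] for xi in x]   # 1, C(x,1), ..., C(x,4)

# Array [dependent column | normal matrix | identity], then forward elimination.
rows = [[sum(b[h] * yi for b, yi in zip(basis, y))]
        + [sum(b[h] * b[k] for b in basis) for k in range(n)]
        + [F(int(h == k)) for k in range(n)] for h in range(n)]
for k in range(n):
    for i in range(k + 1, n):
        m = rows[i][1 + k] / rows[k][1 + k]
        rows[i] = [u - m * v for u, v in zip(rows[i], rows[k])]

g = [rows[k][1 + k] for k in range(n)]        # g_kk = S phi_k^2
t = [rows[k][0] / g[k] for k in range(n)]     # multipliers t_k = b_k0
c = [rows[k][1 + n:] for k in range(n)]       # coefficients of phi_k

def differences(m):
    """Forward solution (4.4) using the first m orthogonal polynomials."""
    return [sum(c[k][j] * t[k] for k in range(m)) for j in range(m)]

def variance_weight_at(xv, m=n):
    """Bracket of (4.10) at an integer point xv: sum of phi_k(xv)^2 / g_kk."""
    b = [F(comb(xv, r)) for r in range(n)]
    phi = [sum(c[k][j] * b[j] for j in range(n)) for k in range(m)]
    return sum(p * p / g[k] for k, p in enumerate(phi))

quartic = [float(v) for v in differences(5)]
```

Multiplying `variance_weight_at(xv)` by the variance of fit then gives the estimated variance of the fitted quartic at that point.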


The above set of orthogonal polynomials will of course be valid for
any series of y data arranged at 10 equally spaced intervals with no
value recorded at the central value of x, where x = 0 at one extreme and
x = 10 at the other extreme. Because of the simple nature of the distribution
of the statistical weights over the scale of x, it may be of interest
to record the orthogonal polynomials more in detail. The exact
values of the last three coefficients of $\phi_5$ are

5.911765 = (100.5)/17,  5.60294 = (95.25)/17,  2.68235 = (45.6)/17.

Since the weighting is symmetrical about the central value of x, x = 5,
the orthogonal polynomials will satisfy the usual symmetrical relations.
They can be written as:

$\phi_2 = (x - 5),\quad 2\phi_3 = (x - 5)^2 - 11$,

$6\phi_4 = (x - 5)^3 - (17.8)(x - 5)$,

$24\phi_5 = (x - 5)^4 - \dfrac{443}{17}(x - 5)^2 + \dfrac{1544.4}{17}$.
ESTIMATION OF THE PARAMETERS OF TYPE III
POPULATIONS FROM TRUNCATED SAMPLES


DES RAJ
University of Lucknow and Indian Statistical Institute, Calcutta


The problem considered is the estimation of the mean and 
variance of Type III populations from singly and doubly 
truncated samples with known truncation points when the 
number of unmeasured observations is (i) unknown for each 
tail, (ii) known separately for each tail, (iii) known jointly for 
the two tails. For singly truncated samples, estimation of all 
three parameters has also been considered. Estimating equa- 
tions are obtained by the method of moments as well as by 
the method of maximum likelihood. Previous results for the 
normal and Type III populations are particular cases of the 
results obtained in this paper. Numerical examples are given. 


The problem of estimating the mean and variance of normal populations
from truncated samples has been considered by Pearson and
Lee [10], Lee [9], Fisher [7], Stevens [12], Cochran [1], Hald [8], Cohen
[2] and [4], and the author [5]. The corresponding problem for the Type
III population, of which the normal population is a particular case, is
considered by Cohen [2] for singly truncated samples when the number
of unmeasured observations in the omitted portion is not known. The
object of this paper is to study Cohen's problem when the number of
unmeasured observations is known and to estimate the mean and variance
of Type III populations from doubly and singly truncated samples
with known truncation points when the number of unmeasured observations
is (i) unknown for each tail, (ii) known separately for each
tail, (iii) known jointly for the two tails.


I. THE PROBLEM 
The Type III population may be written as 


2 
а) Хой, а--15:<», 
аз 
where 
a3?) — 
(EN - АЕ m $c - B odes 
c c 


з “ 336 
and

$c = \dfrac{(4/\alpha_3^2)^{(4/\alpha_3^2) - 1/2}\; e^{-4/\alpha_3^2}}{\Gamma(4/\alpha_3^2)}$.
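The role of this constant can be checked numerically. In the sketch below the standardized density is assumed to have the usual Type III form $f(t) = c\,(1 + \alpha_3 t/2)^{(4/\alpha_3^2)-1} e^{-2t/\alpha_3}$ for $t \ge -2/\alpha_3$, which is the form this constant normalizes; the value $\alpha_3 = 0.5$, the upper integration limit, and the function names are arbitrary illustrative choices.

```python
from math import exp, gamma

def type3_constant(a3):
    """Normalizing constant c for third standard moment a3."""
    p = 4.0 / a3 ** 2
    return p ** (p - 0.5) * exp(-p) / gamma(p)

def type3_density(t, a3):
    """Assumed standardized Type III density; zero to the left of -2/a3."""
    p = 4.0 / a3 ** 2
    base = 1.0 + a3 * t / 2.0
    if base <= 0.0:
        return 0.0
    return type3_constant(a3) * base ** (p - 1.0) * exp(-2.0 * t / a3)

def simpson(func, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

a3 = 0.5
lo, hi = -2.0 / a3, 12.0          # the tail beyond 12 is negligible for a3 = 0.5
area = simpson(lambda t: type3_density(t, a3), lo, hi)
mean = simpson(lambda t: t * type3_density(t, a3), lo, hi)
var = simpson(lambda t: t * t * type3_density(t, a3), lo, hi)
third = simpson(lambda t: t ** 3 * type3_density(t, a3), lo, hi)
```

Under these assumptions the integrals should come out close to unit area, zero mean, unit variance, and a third moment equal to $\alpha_3$.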


The mean, $\mu$, and the variance, $\sigma^2$, of this population are unknown
while in the first instance $\alpha_3$, the third standard moment, is supposed to
be known. Let $x_0'$ and $x_0'' = x_0' + R$ be the left and right truncation
points respectively. Let there be $n_1$ unmeasured observations in the
tail to the left of $x_0'$, $n_2$ unmeasured observations in the tail to the right
of $x_0''$ and $n_0$ measured observations in the sample range. We shall
estimate the mean $\mu$ and the variance $\sigma^2$ in the three cases: (i) when $n_1$
and $n_2$ are not known, (ii) when $n_1$ and $n_2$ are known, (iii) when
$n_1 + n_2 = N - n_0$ is known.


II. ESTIMATION BY THE METHOD OF MOMENTS
The truncated population may be written as 
(3) Тадж), OST SR, 
where 
(4/a5*)—1 


(4) Хе) = 41 x 3 (a? + < 16—019) +), 


and 


QS p MR " 
e с 


It is easy to see that f(z’) satisfies the differential equation 


оз + + 

LE ro. 2 (++: 
қа) ae В, 
(1+2) 2 


Case (i). To obtain the moments of the truncated population about
$x_0'$, we separate the variables in (6), multiply both sides of the resulting
equation by $x'^{\,r}$ and integrate over the range of the truncated population.
Putting r = 0, 1, the first and second moments are given by
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л Јозо да 


Е 
ш = e|- E+ 6(Zi — Ze) Ex 22) 


() {ъ= et ++ (2 2 e) (Zi — Zs) 


where | 
pt -1 | 

d Z = fle) [ Í. | ed : | 

a= geo fron]. 

and 

(9) fü ef} ud £s zi nero, 

(10) 8-14 es =. 

tae = 


(11) = E (к — а) то 


is the rth moment of the sample of $n_0$ measured observations about
$x_0'$. Equating the population and sample moments, the estimating
equations obtained are


- e|- E + i — Za) — 2 5а) 
2-6: 


(12) ыша {i + + (2 = 0) (A — 2) 


С: 
Ur а 2 
These equations in $\xi'^{*}$ and $\sigma^{*}$ can be solved by the Newton-Raphson method.
Here $Z_1$ and $Z_2$ can be calculated from Salvosa's tables [11]. Of course,
$\mu^{*}$ is given by

$(x_0' - \mu^{*})/\sigma^{*} = \xi'^{*}$.

Case (ii). The first two moments of the left tail about $x_0'$ are


7 
т = Нео, 
т 


~) 
2 m 


while the first two moments of the right tail about $x_0'$ are


My = o|- к+(+= 35 n]. 
2 с/т 


(13) 


09 o; R 3 RNn 
5 ^ E ber кыы күш ы 
м ehe eros ШЕ езу) 
where 
1 ғ іс; 
m= "ser fa. 
(15) Tig —2/аз 


у: = ео f. Jeu]. 


Considering $m_1$, $m_2$ and $M_1$, $M_2$ as the contributions per unit observation
of the left and right tails towards the first and second moments
respectively, the modified moments of the total sample of $n_1 + n_0 + n_2$
observations about $x_0'$ may be written as

(16)  $\nu_1' = (n_1m_1 + n_0\nu_1 + n_2M_1)/(n_1 + n_0 + n_2)$,
      $\nu_2' = (n_1m_2 + n_0\nu_2 + n_2M_2)/(n_1 + n_0 + n_2)$.

The moments of the complete population about $x_0'$ are given by


az i! = — Фт, 
mm 


Equating the population moments to the corresponding sample moments,
the estimating equations obtained are similar to (12) where
$Y_1$ and $Y_2$ are substituted for $Z_1$ and $Z_2$ respectively.

Case (iii). The first two moments of the combined tail about $x_0'$ are
given by


no 


R n 
(б^) p : x] 


|--с. c У-т 


— no 
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(1) g2 = o? fı Ф = (= E г) 


no 
X; — X. 
mr un 2) 


R о/о R fio 
иш (+) ож, 
++ (2+) } 


Xo qq) [ f. „лда i T ЕСЛЕ 


х= T ge f^ лоа+ f EXE 


The moments of the total sample of N observations about $x_0'$ are given
by


where 


(19) 


$\nu_1' = [n_0\nu_1 + (N - n_0)g_1]/N$,

(20) $\nu_2' = [n_0\nu_2 + (N - n_0)g_2]/N$.

Equating the sample moments to the population moments (17), we
obtain estimating equations similar to (12), where $X_1$ and $X_2$ are substituted
for $Z_1$ and $Z_2$ respectively.


^". 


III. MAXIMUM LIKELIHOOD ESTIMATION 
Case (i). The likelihood function of the sample may be written as


K vU z " 
І, = =f soa] um (= yè e) 
с" [2 1 c 
oA ” (41a3!)-1 
Aib “(= + e) , 
і „2 \с 


, 
(22) аа ы gr SOR pne 
а 


where 


k — constant, 


, 
с 


and $f(\xi)$ is given by (9). Taking logarithms of (21) and differentiating
partially with respect to $\xi'$ and $\sigma$ and setting to zero, the equations obtained
are


(23) 
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where 
озү 2"! NT 
(24) s= b«2(7«2)] j 


and $Z_1$, $Z_2$, $\nu_1$ are given by (8) and (11) respectively. From these we obtain
the maximum likelihood equations


ГЕ СЕСЕ 
2 c 


i-e enn 


No оз 2 1 


(25) 


These equations in £' and ё can be solved by Newton-Ralphson method. 
Case (ii). The likelihood function of the sample in this case may be
put as
e RU rp Lors 


"m € 1 (41o3*)—1 
(= Э) Ші» (+ 2n : 


1 с 1 


Proceeding as in case (i), the estimating equations obtained are similar
to (25), where $Y_1$ and $Y_2$ are substituted for $Z_1$ and $Z_2$ respectively.
Case (iii). In this case the likelihood may be written as


K E о N—n9 no fg 
= — — (1/3) (= + ) 
= [ LL. уда f КЛ RP. cmi 
E afa" 5 Шау-1 
Apa му, 


Proceeding as in case (i), the estimating equations obtained are similar
to (25), where $X_1$ and $X_2$ are substituted for $Z_1$ and $Z_2$ respectively.


IV. THE SINGLY TRUNCATED CASE 


The results for the singly truncated case can now be obtained in
particular. In case truncation is on the left only, x0’.
so that
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Е’ and E — o, Za RZ, X and RX4— 0; ; 
Zi ME", Х, КЕ’), where 


(41аз?)—1 


a; 24. "E 
TT p ( вм) 22) M \ 
-2/аҙ 2 


74 
(Е) = — ЖЕ): 
7 


(27) 


Making relevant substitutions from (27) in case (iii) of Section II, we
obtain the following equations of estimation by moments, when $n_1$, the
number of unmeasured observations in the left tail, is known.


n = olat) — £], 


28 
a n= otto [2 - v] - o}. 
To solve these equations, we eliminate $\sigma$ to obtain

Va 1 0 оз ] 
29 нв a MALUM A ым) 
pt ия Oe) = MIS Sy 2 + 


Corresponding to any $\xi'$, the right-hand side of (29) can be evaluated
from Salvosa's tables of areas and ordinates of the standardised Type III
function. Hence, for a given sample, $\xi'$ is determined from (29).
And then


(30) e* = [е у е, 


wt = ду — iret. 


Again, substituting in case (iii) of Section III, the maximum likelihood
equations for this case are


1 
zt- 1)s - +) =0, 
2 оз 


No аз? 
(81) 
4 ~( 2 м 
cw ala аталы 
од? тооз? `` оз с 


It is easy to see that $\hat\xi'$ is given by


| 
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НЕЕ ey 


(32) pe iy : 
= —- (E) = 0, 
оз 
while 
(33) ô = fe) — Е], 
id 
(34) й = 0! — ҒӘ. 


In case the number of unmeasured observations in the left tail is not
known, we make relevant substitutions in case (i) of Section II and in
case (i) of Section III and obtain equations by the method of moments
and by the method of maximum likelihood respectively. The manner of
solving these equations is the same as before.


V. ESTIMATION WHEN $\alpha_3$ IS NOT KNOWN


We shall consider the singly truncated case when $n_1$, the number of
unmeasured observations in the left tail, is known. The third moment of
the left tail about $x_0'$ is given by


(85) т. = Б -3p — FR JC Еа T г $e], 
and the third modified moment, $\nu_3'$, of the total sample of $n_0 + n_1$
observations about $x_0'$ is given by

$\nu_3' = (n_1 m_3 + n_0 \nu_3)/(n_0 + n_1)$,

while the third moment $\mu_3'$ of the population about $x_0'$ is

(36) $\mu_3' = \sigma^3(\alpha_3 - 3\xi' - \xi'^3)$.


Equating the first three moments of the population to the corresponding
moments of the sample, the equations of estimation are
and


Ya = 0° 6 (os — Ё)
(37) |


+ [eo = e] pie руф 2])
To solve these equations, we eliminate $\sigma$ to obtain the following.


Ve ; 1 0 оз Л 
за = Q(h'os)) = $E) — РЕ) mmn o m =] 
1 
[e«g) — £p 
fa +24 (as — n+ - ер 
P 2 
From these equations we obtain $\xi'^*$ and $\alpha_3^*$ by the Newton-Raphson method.


In cases, however, when truncation is on the right, the equations of 
estimation are 


n = e[óTQ^) — ЕТ, 
n= {6 + G E eere? - 2 | | 


(88) — = P(b'a) = 
vy 


39 
(89) уз = 05 (ке --Е) 
Ге 

+ E (шеру асы + 2] [Фт(Е) — и], | 
where 
40 qim = 
(40) $ (% + 2 н), | 
and | 

TQ = 2h += Я а, 
(41) ds : 


© аз?) -1 
| f ( + EON " een] 
в 2 


For a given $\xi'$, $T(\xi')$ can be calculated from Salvosa's Tables. The manner
of solving these equations is the same as discussed above.


VI. EXAMPLES INVOLVING SINGLE TRUNCATION 

(a) Method of moments. The method of estimating the parameters in
this case can be best illustrated by considering the example given by
Cohen (1950). This example was constructed by arbitrarily truncating
a sample distribution of the weights of 629 University of Washington
freshmen so that the record of observations less than 110.5 lbs. is
missing. It is known that there are 15 observations in the omitted portion
of the distribution. For the truncated distribution we have

$n_0 = 614$, $n_1 = 15$, $x_0' = 110.5$, $\nu_1 = 32.6596$,
$\nu_2 = 1359.2793$, $\nu_3 = 67269.15$.

Case (I). $\alpha_3$ is known. In this case it is given that $\alpha_3 = 0.6$. The
equations in $\xi'$ and $\sigma$ are
32.6596 = c[0t(£) — &], 
1359.2793 = o?{0 + (3 — En [at — ЕТ. 
The equation to determine $\xi'$ is

$1.27434 = Q(\xi')$.

For a first approximation for $\xi'$, we read from Salvosa's Tables the
value of $\xi'$ such that the area of the standardised Type III curve from
$-2/\alpha_3$ to $\xi'$ is $15/629 = .023847$. Then a first approximation is
$\xi' = -1.68$. Details of further computations for determining $\xi'$ are given
in the table below.
TABLE 1
COMPUTATIONS FOR $Q(\xi')$
$\alpha_3 = 0.6$

$\xi'$                                                                  $Q(\xi')$
-1.68    .023402   .089266   3.814400   .093187   .570300   2.267888   1.31847
-1.75    .017741   .072787   4.102756   .100280   .556205   2.314240   1.28740
-1.79    .015003   .064209   4.279744   .104554   .543949   2.341848   1.27385
-1.78    .015656   .066295   4.234479   .103448   .546084   2.334805   1.27715
-1.789   .015068   .064418   4.275153   .104442   .544251   2.341132   1.27417


Thus $\xi'^* = -1.789$. From (30), $\sigma^* = 17.775$ and $\mu^* = 142.30$.
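The location estimates reported in the worked examples can be checked against the relation $\mu = x_0' - \xi' \sigma$. This relation is an assumption here: it follows if $\xi'$ is the standardized truncation point $(x_0' - \mu)/\sigma$, which is how the printed numbers behave. The sketch below takes the values as read from the scan of Sections VI and VII and recomputes each $\mu$.

```python
def location_estimate(x0: float, xi: float, sigma: float) -> float:
    """Recover mu from xi' = (x0' - mu) / sigma, i.e. mu = x0' - xi' * sigma."""
    return x0 - xi * sigma


# (x0', xi', sigma, mu as printed), values as read from the scan
reported = [
    (110.5, -1.789, 17.775, 142.30),     # Section VI(a), moments
    (90.00, -0.755, 9.8198, 97.4139),    # Section VI(b), maximum likelihood
    (90.00, -0.75, 9.8773, 97.408),      # Section VI(b), moments
    (85.0, -1.7942, 8.2971, 99.8867),    # Section VII(a), moments
    (85.0, -1.6811, 8.9387, 100.0269),   # Section VII(b), maximum likelihood
]

recomputed = [location_estimate(x0, xi, s) for x0, xi, s, _ in reported]
for (x0, xi, s, mu_printed), mu in zip(reported, recomputed):
    print(f"x0'={x0:7.2f}  xi'={xi:8.4f}  sigma={s:8.4f}  "
          f"mu={mu:9.4f}  printed={mu_printed}")
```

Agreement to the printed precision is what one would expect if the readings of the garbled digits are right; it does not confirm the exact form of (30) or (34).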
Case (II). $\alpha_3$ is not known. In this case, the equations in $\alpha_3$ and $\xi'$ are
$1.27434 = Q(\xi', \alpha_3)$, $1.93101 = P(\xi', \alpha_3)$.
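The two constants on the left-hand sides can be reproduced from the sample moments quoted for this example. The sketch below assumes, on the strength of the arithmetic alone, that they are the ratios $\nu_2/\nu_1^2$ and $\nu_3/\nu_1^3$, which is what eliminating $\sigma$ from the moment equations would leave; the variable names are illustrative.

```python
# Sample moments about the truncation point, as given for Cohen's example.
nu1 = 32.6596
nu2 = 1359.2793
nu3 = 67269.15

# Scale-free ratios: sigma cancels from nu2 / nu1^2 and nu3 / nu1^3.
q_target = nu2 / nu1 ** 2
p_target = nu3 / nu1 ** 3

print(f"nu2 / nu1^2 = {q_target:.5f}")
print(f"nu3 / nu1^3 = {p_target:.5f}")
```

Both ratios agree with the constants 1.27434 and 1.93101 to within rounding in the fifth decimal.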


The third standard moment as obtained from the truncated sample
may be used as a first approximation for $\alpha_3$. Thus, $\alpha_3 = .7$ for a first
approximation. A first approximation for $\xi'$ is obtained on the assumption

346 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1953 


that the area of the Type III function from $-2/\alpha_3$ to $\xi'$ is 15/629. This
gives $\xi' = -1.62$ from Salvosa's Tables. Detailed computations involved
in the various steps to solve the equations by the Newton-Raphson
method are given in the tables below.


TABLE 2
COMPUTATIONS OF $Q(\xi', \alpha_3)$ AND $P(\xi', \alpha_3)$

$\alpha_3$   $\xi'$                                                        $Q(\xi', \alpha_3)$   $P(\xi', \alpha_3)$
(1)    (2)      (3)       (4)       (5)        (6)       (7)       (8)       (9)
0.7    -1.60    .020153   .106606   4.076244   .099583   .608340   1.34910
       -1.80    .010334   .054564   5.280046   .128092   .541205             1.93773
       -1.79    .010801   .056727   5.208613   .127246   .544210   1.27523
       -1.81    .009800   .052450   5.352041   .130750   .538236             1.92641
0.6    -1.78    .015656   .066295   4.234479   .103448   .546084   1.27715   1.94147
       -1.79    .015003   .064209   4.279744   .104554   .543949             1.93008
       -1.789   .015068   .064418   4.275153   .104442   .544251             1.93117


TABLE 3
DIFFERENCES OF $\xi'$ AND $\alpha_3$

$\alpha_3$   $\xi'$ from $Q(\xi', \alpha_3)$   $\xi'$ from $P(\xi', \alpha_3)$   Difference
0.7    -1.793                 -1.806                 .013
0.6    -1.789                 -1.789                 .000


We thus find that $\alpha_3 = 0.6$ and $\xi' = -1.789$. Consequently, the estimates
of $\mu$ and $\sigma$ are the same as obtained in Case (I). The results obtained
by Cohen (1950) for the case when the number of unmeasured
observations is unknown are shown in the table below.


TABLE 4
ESTIMATES OF THE PARAMETERS

Parameters   Values from       Estimates obtained by Cohen     Estimates obtained in this paper
             complete sample   $\alpha_3$ known   $\alpha_3$ unknown     $\alpha_3$ known   $\alpha_3$ unknown
$\alpha_3$         0.59              0.60         0.71            0.60         0.60
$\xi'$         -1.805            -1.850       -1.878          -1.789       -1.789
$\sigma$           17.59             17.43        17.28           17.775       17.775
$\mu$           142.25            142.75       142.95          142.30       142.30

(b) Method of maximum likelihood. The method of obtaining the
maximum likelihood estimates is best illustrated by taking a random
sample from the Type III population with $\alpha_3 = 0.6$, $\mu = 100.00$, $\sigma = 10.00$,
truncated at 90.00. The random sample is


122.88 94.92 96.06 103.93 96.15 
90.44 100.93 104.53 105.94 94.62 
100.41 90.88 100.67 93.46 103.43 
90.27 93.26 131.28 94.11 103.84 
109.99 92.39 90.22 90.49 91.57 
102.38 99.59 94.42 95.42 102.71 


For the sample selected, we have

$n_0 = 30$, $n_1 = 5$, $\nu_1 = 9.373$, $\nu_2 = 172.74529$, $x_0' = 90.00$.

On the assumption that the area under the standardised Type III
curve with $\alpha_3 = 0.6$ from $-2/\alpha_3$ to $\xi'$ is $5/35 = .142857$, a first
approximation for $\xi'$ from Salvosa's Tables is given by $\xi' = -1.04$. Beginning with
this value we finally obtain the following table.


TABLE 5
COMPUTATION OF $R(\xi')$

$\xi'$                    $R(\xi')$
-.75    30.916744    +.04939
-.80    28.811959    -.15082
-.76    29.880896    -.05288

Hence, $\hat\xi' = -.755$ by interpolation.
From (33) and (34), $\hat\sigma = 9.8198$, $\hat\mu = 97.4139$. The estimates obtained
by the method of moments are

$\xi'^* = -.75$, $\sigma^* = 9.8773$, $\mu^* = 97.408$.

The two methods appear to give estimates close to each other.
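The interpolation step in Table 5 can be sketched as a single linear interpolation between the two tabulated arguments at which $R$ changes sign. This is an assumption about how the interpolation was done; only the two bracketing rows ($-.75$ and $-.76$) are used, and the helper name is illustrative.

```python
def interpolate_root(x_a: float, f_a: float, x_b: float, f_b: float) -> float:
    """Linear interpolation for the zero of f between (x_a, f_a) and (x_b, f_b)."""
    if f_a * f_b > 0:
        raise ValueError("the two points do not bracket a sign change")
    return x_a - f_a * (x_b - x_a) / (f_b - f_a)


# Rows of Table 5 on either side of the sign change of R.
xi_hat = interpolate_root(-0.75, 0.04939, -0.76, -0.05288)
print(round(xi_hat, 3))
```

Rounded to three decimals, the result agrees with the value $-.755$ quoted in the text.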


VII. AN EXAMPLE INVOLVING DOUBLE TRUNCATION 


The manner of solving the estimating equations in this case will be
illustrated by drawing a random sample of size 30 from the Type III
population with parameters $\alpha_3 = 0.6$, $\mu = 100$, $\sigma = 10$, truncated at
$x_0' = 85$ on the left and at 115 on the right. Only case (i) will be
considered as the method is the same for other cases.
The sample drawn is


99.937 101.172 112.459 109.973 99.467 
102.483 95.331 97.380 95.774 95.591 
95.441 102.363 99.116 86.741 96.028 
102.497 87.314 97.187 88.063 98.965 
102.807 86.637 94.561 100.306 110.087 
110.324 105.355 107.434 99.955 97.805 


For the sample selected, we have
$n_0 = 30$, $\nu_1 = 14.118433$, $\nu_2 = 246.292945$, $R = 30$.
(a) Method of moments. The equations to be solved are
30 
PUE, o) = o| = ea sou - Z) - = 2z] 
т 
— 14.118433 = 0, 
qu, o) = eie ee a amc - e. - Za) 
30 30 
---Ф Г T3 (s T =) — 246.292945 = 0. 
с с 
As a first approximation for $\sigma$, we use the sample standard deviation
$s = 6.853$ and for $\xi'$ the corresponding quantity calculated from the
sample, viz., $-\nu_1/s = -2.06$. After several steps we obtain the table
given below.


TABLE 6
DIFFERENCES OF $\sigma$ AND $\xi'$

$\sigma$        $\xi'$ from $P(\xi', \sigma)$   $\xi'$ from $Q(\xi', \sigma)$   Difference
8.28729   -1.7960              -1.7963              +.0003
8.31025   -1.7917              -1.7913              -.0004


By interpolation we have $\xi'^* = -1.7942$, $\sigma^* = 8.2971$ and consequently
$\mu^* = 99.8867$.
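A sketch of the interpolation in Table 6, under the assumption that it is linear in $\sigma$: the value of $\sigma$ is chosen so that the difference between the two solutions for $\xi'$ vanishes, and $\xi'$ is then read off at that $\sigma$. The final line uses $\mu = x_0' - \xi' \sigma$ with $x_0' = 85$, which is likewise an assumption consistent with the printed figures. Table values are as read from the scan.

```python
# Rows of Table 6: (sigma, xi' from P, difference between the two xi' values).
sigma_a, xi_a, diff_a = 8.28729, -1.7960, +0.0003
sigma_b, xi_b, diff_b = 8.31025, -1.7917, -0.0004

# Weight at which the linearly interpolated difference is zero.
w = diff_a / (diff_a - diff_b)

sigma_star = sigma_a + w * (sigma_b - sigma_a)
xi_star = xi_a + w * (xi_b - xi_a)
mu_star = 85.0 - xi_star * sigma_star  # assumes mu = x0' - xi' * sigma

print(round(sigma_star, 4), round(xi_star, 4), round(mu_star, 4))
```

The interpolated values agree with the quoted estimates to about the fourth decimal for $\sigma^*$ and $\xi'^*$, and to about the third for $\mu^*$.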
(b) Method of maximum likelihood. The equations to be solved are 


(1°) |- Ра 32)(Z, — 2) — 34 22] — 14.118438 = 0, 
с 


(2°) 10111 У) [1 T3203 zb — 3.33333 + (2, — 22) = 0. 
с 


First approximations for $\sigma$ and $\xi'$ are the same as in (a).
After several steps we obtain the following table.


TABLE 7
DIFFERENCES OF $\sigma$ AND $\xi'$

$\sigma$         $\xi'$ from (1°)   $\xi'$ from (2°)   Difference
8.982036   -1.6739         -1.6752         +.0013
8.902077   -1.6871         -1.6860         -.0011


From this we have $\hat\sigma = 8.9387$, $\hat\xi' = -1.6811$, $\hat\mu = 100.0269$.

It may be noted that the two methods do not give the same estimates.
These estimates, however, will be identical when $\alpha_3 = 0$, as
shown by the author [6].
PROCEEDINGS


AMERICAN STATISTICAL ASSOCIATION 
112TH ANNUAL MEETING


PALMER HOUSE, CHICAGO, ILLINOIS 
DECEMBER 28, 1952 


MINUTES OF THE ANNUAL BUSINESS MEETING 
The meeting was called to order by Aryness Joy Wickens, outgoing President 
of the Association. 
Report of the Committee on Elections 


R. J. Eggert, Chairman of the Committee on Elections, reported to the
membership the results of the balloting for officers for 1953. The following
officers were elected:


President Elect                      Herbert Marshall
Vice President                       Rensis Likert
Directors (1953-55)                  Wilfrid J. Dixon
                                     Margaret J. Hagood
Representative at Large (1953-54)    Kenneth Miller


District Representatives


Northeastern District                David Votaw, Jr.
Eastern District                     A. J. Jaffe
Southeastern District                Ezra Glaser
North Central District               William Madow
South Central District               John Stockton
Western District                     Harry Schwartz


Nominations for District Representatives 

Mrs. Wickens presented to the membership some of the difficulties attached to 
the nomination of district representatives. She stated that the matter would be 
presented to the incoming Board and Council. However, she remarked that it 
was up to the membership to make these nominations and that heretofore the 
nominations were late in arriving at the Secretary’s office. Since this was a
matter of interest to the whole membership, Mrs. Wickens urged that these
nominations be forwarded to the Secretary’s office early in the year.


The Report of the Board of Directors for 1952 


Morris Hansen read the report to the membership. It was moved that the 
report be received and approved. The motion was carried. 


Secretary-Treasurer’s Report for 1952 


Samuel Weiss read the report of the Secretary-Treasurer to the membership.
It was moved that the report be accepted and approved. The motion was carried. 


Coming Annual Meetings 


Mrs. Wickens asked Mr. Weiss to present to the membership the dates and
places that had been decided upon for future annual meetings of the Association.
They are as follows:

Year   Place               Hotel         Dates
1953   Washington, D. C.   Shoreham      December 27-30
1954   Montreal            Mount Royal   September 10-13
1954   San Francisco       (This is a regional meeting; hotel and dates not yet chosen.)
1955   New York City       Biltmore      Late in December
Resolutions 


Mrs. Wickens called upon Ralph Watkins to present the following resolutions 

to the membership: 

1. Resolution regarding the Program Committee. 

RESOLVED that the officers and members of the American Statistical 
Association express deep appreciation for the excellent program prepared 
by the members of the Program Committee under the leadership of Alfred 
Watson, Chairman, and Edward Bloom, Secretary. 

The resolution was approved. 

2. Resolution regarding local Arrangements Committee and Chicago Chapter. 
RESOLVED that the officers and members of the American Statistical 
Association express their profound appreciation to the Local Arrangements 
Committee under the Chairmanship of Wesley Mitchell and to all of the 
individual members of the Chicago Chapter for their outstanding work and 
hospitality in connection with the arrangements for the 112th annual meeting 
of the Association. 

The resolution was approved.

3. Resolution regarding the retirement of Sylvia Weyl. 

WHEREAS, Sylvia C. Weyl has ably and faithfully served the Association
in the capacity of Executive Assistant and Editor of The American
Statistician, the former from 1944 to July 1952, and the latter from 1947 to
December 1952, and has now retired from these positions;

THEREFORE, BE IT RESOLVED, that the members of the American
Statistical Association in annual meeting assembled do hereby express great
appreciation of the outstanding service rendered by Mrs. Weyl during her
terms of office, and hereby express our recognition of her valuable assistance
so competently performed over this long term of years.


The resolution was approved. 
Mrs. Wickens turned the chairmanship of the meeting over to Professor 
Cochran, the president of the Association for 1953. 


New Business 
Professor Cochran asked if there was any new business to be discussed. It was
moved that the membership consider a time other than Christmas week for the
1956 annual meeting of the Association. The motion was carried. The meeting
was adjourned.


Report of the Board of Directors for 1952 


During the calendar year just ended, the Association continued its policy
of sound economy, added moderately to its surplus, registered a significant
increase in membership and expanded the scope of its manifold activities.


Program Committee


Organized around the theme, Applications and Uses of Statistics, the Annual 
Meeting was planned and directed by a Program Committee chaired by Alfred 
Watson with Edward Bloom as secretary. The section on Business and Economic
Statistics under the chairmanship of Donald Riley, the section on the Training 
of Statisticians under Philip McCarthy, the Biometrics section under Alexander 
Mood, as well as the Committee on Statistics in the Physical Sciences under the 
chairmanship of John Tukey and the Committee on Statistics in the Social 
Sciences under Conrad Taeuber all contributed substantial sections of the Annual 
Meeting Program. 


Bureau of Mines Survey 


The Board is happy to report that the Survey of the Statistical Program of the 
Bureau of Mines under the able direction of J. E. Morton, advised by a commit- 
tee consisting of Raymond Bowman, Ralph Watkins and Clarence Long, was 
completed last summer. The Report has been submitted to the Bureau of Mines 
for its consideration and action. 


Other Committees 


The Association’s advisory committee to the Bureau of Labor Statistics was 
engaged during 1952 in the task of cooperating in the preparation of the new 
consumers’ price index. The committee to advise the Bureau of the Budget on 
statistical policy reports started its work during the calendar year. 


Business and Economic Statistics Section 


This Section drafted its charter last year, which was thereupon submitted to 
and accepted by the Board which modified it slightly and referred it back to the 
Section for its consideration. This Section took the leadership in holding two
conferences during 1952, one in Illinois and one in Pennsylvania. The
Pennsylvania meeting was organized in conjunction with The University of
Pennsylvania, Wharton School of Business Administration on the topic, “The Role
of Statistics in Business Planning and Control.” The Middle Western meeting
was organized with the University of Illinois around the topic, “Minimizing
Business Risks.” All institutional members of the Association and all members of
the Section were invited.


Tulsa Chapter 


The Association welcomes its newest chapter, organized in Tulsa, Oklahoma, 


in 1952. The proposed Constitution of the group has already been approved and 
its Charter granted. 


The American Statistician 


After having served over five years as Editor of The American Statistician, 
the Association’s news publication, Sylvia Weyl asked to be relieved from her 
assignment. Mrs. Weyl was the first editor of The American Statistician and from
its inception in August of 1947 she made The American Statistician a medium
for the discussion of non-theoretical matters of concern to the statistical
profession, including the aims and status of the profession, discussions of
statistical institutions and non-theoretical expository articles on current
statistical activities and projects. Under Mrs. Weyl’s editorship, a substantial number of articles
dealing with the training of statisticians, the needs and activities of business 
statisticians and descriptions and discussions of practical aids to working statis- 
ticians have appeared. The publication has filled the void between the technical 
professional discussions of statistical theory which appear in our Journal and the 
simple listing of program notes and announcements which used to be carried in 
the Association's Bulletin. Mrs. Weyl did a remarkable job of producing a
readable and useful publication of a sort not previously undertaken by professional
societies. 

An ad hoc subcommittee of the Committee on Publications was designated to 
search for a new editor. Pursuant to its recommendations, The American Statis- 
tician will be edited under the sponsorship of the Wharton School of Business 
of the University of Pennsylvania. Dr. Raymond Bowman has agreed to give 
the necessary guidance and supervision; Almarin Phillips of his staff will serve 
as Editor. 


JASA 


At the request of W. Allen Wallis, Editor, and at the recommendation of the 
Secretary, the Board approved an expanded budget for the Journal of the 
American Statistical Association during the year just ended. Size and content of 
the periodical were expanded, pages printed in 1952 were 20% greater than in 
1951. 


Constitution


The Committee on the Constitution submitted a revised version to the Board
of Directors. The proposed revisions include a number of changes of a
non-controversial character designed to permit greater ease and flexibility of
operations. After careful consideration, the Board approved the Committee’s
proposals. The Constitution in its revised form is being submitted to the
membership for approval.


Liaison 


The Board is desirous of increasing membership participation in the direction 
of the Association's affairs. With this in view, it was decided to send minutes of 
all its meetings to the chairmen of sections and of standing committees, as well
as to the members of the Board and Council. In order to ensure clearer
presentation of the various opinions and viewpoints expressed at Board meetings, the
Board at its last meeting decided to appoint a special rapporteur to prepare a
summary of its views which will supplement the formal minutes.


Report of the Secretary-Treasurer for 1952 

During 1952, the Secretary’s office continued to devote its efforts to extremely
careful management. Through rigorous economy the Association completed its
third successive year in the black. During 1952 the increase in the Association’s
surplus resulted mainly from strict financial control rather than from
membership gains.

By the end of 1951, it had become apparent that Association membership had
remained virtually static for a period of three to four years. Accordingly, the
Secretary’s Office undertook a special membership drive last year, circularizing
by letter a large number of mathematicians, economists, agricultural statisticians
and others whose professional activities brought them into close relationship
with the work of the Association. This campaign was interrupted last summer,
but will be resumed in 1953. It is encouraging to note that the Association has,
for the first time since 1948, shown a significant growth in membership.

The number of members at the end of 1951 was 4,356. During 1952 about 350 
members will be dropped because of resignation, death, or non-payment of dues. 
New members in 1952 totalled 666. Thus the Association starts the year 1953 
with 4,655 members. The net membership growth in 1952 was 300 members— 
the first net increase in 5 years. 
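The membership figures can be reconciled with a short calculation. The sketch below uses only the numbers stated in the paragraph; since the count of dropped members is given as "about 350", the exact figure implied by the opening and closing totals is computed rather than assumed. Variable names are illustrative.

```python
# Figures as stated in the Secretary-Treasurer's report.
members_end_1951 = 4356
new_members_1952 = 666
members_start_1953 = 4655

# Number of members dropped that reconciles the two totals exactly.
implied_dropped = members_end_1951 + new_members_1952 - members_start_1953
net_growth = members_start_1953 - members_end_1951

print(f"implied members dropped: {implied_dropped}")  # reported as "about 350"
print(f"net growth in 1952:      {net_growth}")       # reported as 300
```

The implied number dropped is somewhat above the rounded figure in the text, and the net growth rounds to the 300 reported.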

Expenses for 1952 were budgeted at $48,666. The actual expense for the year 
was $47,529.49—a drop of over $1,000. Although the increase in the cost of 
publications was about $300 over the 1952 budget, certain other expenses were 
kept under budget level by saving wherever possible. 

While the saving in any particular item was not large in itself, the combined 
total was just about sufficient to offset the rise in the cost of publications. The 
actual income for 1952 was $54,061. This is about $700 under the budget figure. 
The loss is primarily due to the decrease in the sales of back issues of the Journal. 
The actual net income for 1952 is $6,531 which provides for a substantial increase
to the Association’s surplus. Thus the Association will start the year 1953 with
$16,965 in surplus funds.
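The figures in this paragraph can be cross-checked against the Statement of Income and Surplus Accounts. The sketch below takes the amounts with cents from that statement and the budgeted expense from the text; `Decimal` is used so that the cents are exact. Variable names are illustrative.

```python
from decimal import Decimal

# Amounts from the report and the Statement of Income and Surplus Accounts.
income_1952 = Decimal("54061.18")
expense_1952 = Decimal("47529.49")
budgeted_expense = Decimal("48666.00")
surplus_start = Decimal("10433.65")

net_income = income_1952 - expense_1952
surplus_end = surplus_start + net_income
under_budget = budgeted_expense - expense_1952

print(f"net income for 1952:      {net_income}")
print(f"surplus at end of 1952:   {surplus_end}")
print(f"expense under budget by:  {under_budget}")
```

These agree with the rounded amounts in the text: net income of $6,531, a closing surplus of $16,965, and expense more than $1,000 under budget.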


Office Organization 


As a result of the resignation of Sylvia Weyl, who had been Executive Assistant 
to the Secretary since the beginning of 1944, it was necessary to reorganize the 
Secretary’s office. With the agreement of the liaison committee of the Board of 
Directors, Edgar Bisgyer was appointed to serve in a managerial capacity in the 
Secretary’s office. Mrs. Weyl, who in addition to her administrative work had 
been editor of The American Statistician since it was founded in 1947, agreed to 
continue to edit that publication until the Board of Directors and the Secretary 
were able to find a suitable person to assume the editorship. The transfer of
editorial function from the Washington office to the University of Pennsylvania
campus will thus be carried out in an orderly fashion and without loss of
efficiency.


Financial Recommendations 


The Treasurer’s Report, which is printed separately, shows the strengthened
financial position of the Association and emphasizes that 1952 was the third
successive year of accruing surplus. In this connection, the Secretary has
recommended to the Board of Directors that the Association plan to build up a surplus
equivalent to one full year’s income before entering into any major expansion
of its activities. It is proposed that approximately $3,000 be added to surplus
each year until this objective is attained. With this in mind the budget for 1953
has been calculated to provide a net income of slightly over $3,000. The proposed
income is budgeted at $50,650, while expense has been calculated at $47,012.
While expense is calculated as closely as possible, income is figured very
conservatively and the actual income figure for 1953 may be expected to be somewhat
larger than that listed in the budget.
To the Board of Directors of
American Statistical Association. 


I have examined the attached financial statements of American Statistical 
Association relating to the year ended December 31, 1952. My examination 
was made in accordance with generally accepted auditing standards and, ac- 
cordingly, included such tests of the accounting records and such other auditing 
procedures as were considered necessary in the circumstances. 

The recorded cash receipts for the year were traced to the deposits shown on
the bank statements and the amounts for dues and subscriptions were tested 
with the membership and subscription records. The paid checks were inspected 
and related vouchers tested in support of cash disbursements for the year. The 
bank balances were reconciled with amounts reported directly to me by the 
depositories and the cash on hand at December 31, 1952, was verified by
inspection. I did not check the membership and subscription records in detail or make
any independent verification of the inventory of old Journals, the office records 
of which are based, in part, on data assembled in prior years. 

In accordance with a resolution of the Board of Directors, the expense incurred
in publishing a directory, distributed to the membership in 1951, is being spread
over a three-year period although such costs would appear to be applicable
primarily to the year 1951. The amount deferred at December 31, 1952, aggregated
$831.86.

In my opinion, the accompanying statements present fairly the position of
American Statistical Association at December 31, 1952, and the results of its
operations for the year, in conformity with generally accepted accounting
principles applied on a basis consistent, except as mentioned in the preceding
paragraph, with that of the preceding year.
James G. Jester

AMERICAN STATISTICAL ASSOCIATION
BALANCE SHEET


Assets
                                                            December 31,
                                                         1952          1951
Cash in banks and on hand                             $39,503.14    $28,618.28
Accounts receivable                                     1,099.58      1,385.48
Investment in United States Savings Bonds,
  Series C, due 1962, at cost                           3,100.00      3,100.00
Inventory of old Journals at approximate cost           1,909.85      1,625.85
Inventory of monograph on Acceptance Sampling,
  at cost                                                 233.93        452.54
Inventory of emblems, at cost                             463.50        588.00
Furniture and fixtures, at cost less depreciation       2,192.42      2,026.99
Deferred charges:
  Deferred Membership Directory expense                   831.86      2,131.86
  Other                                                 1,007.68        723.48

                                                      $50,342.15    $40,652.48


$ 4,811.46 


$13,618.00 
4,379.60 
4,365.67 


$24,568.64 $22,363.27 


Net Worth:
Life membership reserve                                             $ 3,044.10
Surplus, per statement                                               10,433.65

                                                      $19,843.13    $13,477.75

                                                      $50,342.15    $40,652.48

AMERICAN STATISTICAL ASSOCIATION
STATEMENT OF INCOME AND SURPLUS ACCOUNTS


                                                       Year ended December 31,
                                                         1952          1951
Income:
  Dues, current year                                  $37,101.00    $33,358.00
  Dues, prior year                                        184.00        400.00
  Life membership income                                  166.81         84.40
  Subscriptions                                         9,971.35      9,433.15
  Advertising                                           1,594.47      2,251.13
  Journal sales, less cost of sales                     1,079.14      1,907.03
  American Statistician sales                              82.65        158.65
  Acceptance Sampling monograph, less cost of sales        24.39        210.60
  Mailing list income                                     704.17        683.34
  Biometrics sales                                        453.25        490.80
  Interest income                                         541.37        395.40
  Reimbursement of overhead expense,
    Bureau of Mines project                             2,000.00      1,500.00
  Miscellaneous                                           159.08        702.63
  Total income                                        $54,061.18    $51,575.13

. 
Expense: 

Journal—printing, mailing and reprints... ..... $12,858.57 $0,997.19 
Salaries and wages........................... 14,716.81 14,817.96


5,810.75 4,815.68 
1,800.00 1,233.50 
550.72 141.29 
2,400.00 2,400.00 
2,434.54 1,059.39 
1,163.45 1,911.95 
702.30 2,375.00 
563.71 482.11 
2,001.67 1,196.52 
555.87 510.86 
970.00 970.00 
1,501.10 1,947.46 


American ‘Statistician 


Total expense                                         $47,529.49    $44,458.91

Excess of income over expense for the year            $ 6,531.69    $ 7,116.22
Add: Surplus account at beginning of year              10,433.65      3,317.43
Surplus account at end of year                        $16,965.34    $10,433.65


BOOK REVIEWS 


Wesley Clair Mitchell: The Economic Scientist. Arthur F. Burns, editor. New
York: National Bureau of Economic Research, Inc., 1952. Pp. viii, 327. $4.00.


See the article by Adolf A. Berle, pp. 169-175 in this issue. 


An Introduction to Scientific Research. E. Bright Wilson (Theodore William 
Richards Professor of Chemistry, Harvard University). New York: McGraw- 
Hill Book Company, 1952. Pp. v, 365. $6.50. 


E. L. LEHMANN, University of California (Berkeley)


This volume has as its primary concern the experimental and observational
aspects of science, and its main purpose is to provide a collection of
general principles and specific methods applicable to the planning and analyzing
of scientific work of this kind. Under these circumstances it seems quite
natural (at least to a statistician) that the concept of randomness pervades
the book, and that of the eight principal chapters, five are of an essentially
statistical character. Taken together, these form an introduction to the
concepts and methods of modern statistics.

Any book covering such wide territory faces a twofold danger. It may get 
lost in generalities which are of no real use to the practicing scientist. Alter- 
natively, it may become a collection of rules and recipes, which are not only 
tedious but also likely to be misused if they do not grow out of an
understanding of the basic concepts and are not set within a framework of theory
that indicates the necessary limitations. The author manages to avoid this
double pitfall by putting considerable emphasis on the conceptual side of the
subject, and by developing the techniques as examples of the general
approach rather than in the dogmatic way one so frequently encounters.

In setting forth the most important statistical notions, such as those of
hypothesis testing, confidence intervals, factorial experimentation, and
randomization, the author uses the device of first introducing a concept only
qualitatively in a natural experimental setting. An analytical development
is usually deferred until a later chapter and frequently the treatment of more
specific problems at a still later point provides an opportunity for reiteration
of the fundamental principles and further clarification of the conceptual
difficulties. This gradual approach saves a reader unfamiliar with the subject
from being overwhelmed by so many new ideas.

The development of the main statistical theories is carried out in Chapters 
4 (design of experiments, 33 pages), 8 (analysis of experimental data, 63 
pages) and 9 (errors of measurement, 45 pages), where numerous specific 
techniques are also given, such as testing and estimation in binomial, normal 
and Poisson distributions, testing for goodness of fit, analysis of variance, 
and control charts. Many further methods are mentioned only briefly but always 
concretely, as for example run tests, sequential probability ratio tests, and 
tolerance limits. 

Closely related to the above material is Chapter 7 (classification, sampling, 
and measurement, 16 pages) in which the concept of a sample is clarified, and 
scientific induction is discussed as the method of inference from a sample to 
the population from which it was drawn. A formal discussion of probability 
is postponed until Chapter 10 (probability, randomness, and logic, 22 pages), 
where it is defined in terms of a random sampling process and where the 
addition and multiplication rules are derived. The author here also discusses 
the problem of how to utilize prior information, and in this connection gives 
a brief account of the subjective point of view as presented by Jeffreys. 

As the author states, the book is intended primarily for “students begin- 
ning research and for those more experienced research workers who wish an 
introduction to various topics which were not included in their training.” 
While a number of derivations are sketched that require differential argu- 
ments, such as those of the t- and χ²-distributions, these are usually given at 
the end of a chapter, so as not to interrupt the main development. Conse- 
quently, the book has much to offer even to a reader not familiar with the 
calculus. Of great help are the many examples, some particularly interesting 
ones being taken from the physical sciences, and the detailed discussion of 
the many pitfalls that threaten the unwary. 

If fault must be found, the reviewer would have preferred a clearly stated 
general definition of mathematical expectation and variance to the one given 
here by implication. He also missed a statement of the additivity of expecta- 
tions and other results of this type, which one feels might have been included 
in Chapter 10. These criticisms, however, view the book as a text in statistics, 
which it is not meant to be. Also, they are concerned solely with the mathe- 
matical aspects of the theory. With regard to the meaning, intent and possi- 
bilities of statistical methods, the reviewer knows of no clearer or, on its 
level, more useful account. 

Finally, it should be mentioned that in addition to the five chapters re- 
viewed here the book contains five chapters of a general character (Chapter 
1, choice and statement of a research problem, 9 pages; Chapter 2, searching 
the literature, 11 pages; Chapter 3, elementary scientific method, 15 pages; 
Chapter 6, execution of experiments, 24 pages; Chapter 13, reporting the 
results, 10 pages); and three important technical chapters (Chapter 5, de- 
sign of apparatus, 58 pages; Chapter 11, mathematical work, 29 pages; 
Chapter 12, numerical computations, 22 pages). 


Statistical Theory in Research. R. L. Anderson and T. A. Bancroft. New York: 
McGraw-Hill Book Company, 1952. Pp. xix, 399. $7.00. 


Churchill Eisenhart, National Bureau of Standards 


This volume treats in detail an amazingly large number of topics in statis- 
tical theory and methodology, with careful attention to the assumptions 
and mathematics upon which they are based, and illustrates their applica- 
tion through numerous worked examples in the text, plus a large number 
of well chosen exercises at the end of each chapter. Each chapter has its own 
list of “References cited,” to which explicit reference is made at appropriate 
points in the text. These are well chosen, as are the supplemental items listed 
as “other reading.” Together they will enable the reader (or teacher) to gain 
(give) fuller information on topics treated incompletely, or only mentioned, 
in the text. 

Written to fill a need expressed by “many research workers” for “a conven- 
ient reference book on statistical theory pointed to research problems, which 
could be used in conjunction with their books on general statistical methods, 
experimental design, and survey sampling,” this book should be well received 
by students and research workers in the fields of agricultural, biological, and 
sociological research who (1) have a “good background in differential and in- 
tegral calculus” and (2) have had some first-hand experience with conducting 
and interpreting experiments in one or more of these fields. For these, it will 
serve to explain the mathematical bases of the various statistical techniques 
with which they have become acquainted through their experimental work, 
or from elementary “practical” courses in statistical method, or both. Stu- 
dents and research workers in the physical sciences and engineering, on the 
other hand, will find the material covered herein somewhat less familiar, and 
its scope somewhat less satisfactory—some statistical principles and tech- 
niques of importance in these fields are only touched upon briefly here, and 
others are not even mentioned. 

Viewed as a textbook, this volume is really two books in one. Part I, en- 
titled “Basic Statistical Theory,” and the first four chapters (Chaps. 13-16) 
on regression analysis in Part II, are suitable for a one-year course in mathe- 
matical statistics taught, for example, in a department of mathematics or 
statistics, and can be understood by a student with the mathematical ma- 
turity of a mathematics major even though he may have had no first-hand 
experience with experimentation. These same chapters will be more difficult 
for the experimental scientist; here his practical intuition born of experience 
will serve him less well. 

Part II is entitled “Analysis of Experimental Models by Least Squares.” 
Experimental scientists in agriculture, biology, and social science who have 
had some experience with analysis of variance and modern experimental 
arrangements will be much more at home here in spite of the fact that the 
complexity of the algebraic manipulations involved will give them plenty 
of opportunity to exercise their mathematical skills. Here their intuition de- 
rived from experience will serve them well, and I believe that they will fully 
appreciate the import of the mathematical details even though they may 
be unable to reproduce them without aid from the book itself. On the other 
hand, the student who approaches the subject from the mathematical view- 
point cannot in my opinion fully appreciate the material presented in Part 
II until he has gained some practical experimental experience, no matter 
how adept he becomes at the mathematics. 

The physical scientist and engineer without previous training or experi- 
ence in analysis of variance and the Fisherian principles and techniques of 
experiment design will be in much the same boat—they will have the mathe- 
matics but not the viewpoint needed to fully appreciate these chapters. Be- 
fore tackling Part II, they will do well to gain the requisite perspective by 
reading, for example, Chapters 4 (design of experiments) and 8 (analysis of 
experimental data) in An Introduction to Scientific Research, by E. Bright 
Wilson, Jr. (see preceding review). 

In the preface, the authors “welcome suggestions on methods of improv- 
ing this book both from a reference and from a text standpoint, without 
adding materially to the complexity of the material or the length of the 
book.” My own personal feeling is that a revision of this volume should not 
be attempted at all. Instead, Part II, of which Bancroft contributed the 
first four chapters and Anderson the remaining nine, should be split off from 
Part I, and issued separately as a monograph on “Analysis of Experiments by 
Least Squares, and the Method of Variance Components.” Part II is a very 
compact and lucid treatment of the mathematics underlying these distinctly 
different aspects of “analysis of variance and covariance”; is filled with up- 
to-the-minute information on these topics; and contains in the last chapter 
a useful “summary of needed research.” This material forms a very useful 
supplement to technique books on analysis of variance and the design of ex- 
periments, and in the form of a separate volume could and should be kept 
up-to-date as such. 

Except for isolated sections, Part I is essentially a compact treatment of 
statistical theory appropriate to measurement data (in contrast to enumera- 
tion data, and rank-order data). It might well be expanded a bit to make it a 
fairly comprehensive monograph on this phase of statistics. As it stands, it is 
a very readable and yet compact coverage of the elementary theory of prob- 
ability, univariate and multivariate distributions, mathematical expectations 
and moments (including moment-generating functions and cumulants), trans- 
formation of variables and derivation of sampling distributions of statistics, 
and orthogonal linear functions; together with point estimation (from the 
Fisherian viewpoint), interval estimation and tests of statistical hypotheses 
(from the Neyman and Pearson viewpoint). To round it out it needs a dis- 
cussion of the concept and techniques of “statistical control” in relation to 
measurement processes, a compact discussion of the various different 
“straight-line situations” of importance to the measurement analyst, some- 
thing on “the law of propagation of error,” and a bit on sequential analysis. 

It is my opinion that what is needed today is not more books that at- 
tempt to cover a wide (and for the most part the same) selection of ma- 
terial in detail, but rather, some books that provide a panoramic view of a 
large fraction of existing statistical theory and methodology, and others that 
are comprehensive monographs on some single phase in which the author is 
particularly well grounded. 


Statistics for Sociologists. Revised Edition. Margaret Jarman Hagood and Dan- 
iel O. Price. New York: Henry Holt and Company, 1952. Pp. xii, 575. $5.75. 


A. W. Marshall, The Rand Corporation 


This is a revised version of a book published in 1941 by the senior author. 
The revisions and additions include (1) new chapters on sampling in social 
surveys and on applications of factor or component analysis in sociological 
research, (2) the introduction of the work of Guttman into the chapter on 
indexes and scales, (3) new, up to date reading references, (4) substitution of 
current illustrative materials in almost all examples, (5) reduction of space 
devoted to computational procedures, (6) systematic collection at the end of 
the text of the most useful formulae hitherto scattered throughout it, 
and (7) deletion of Part V of the earlier work, which contained material on 
birth and death rates and the construction of life tables. All of these changes 
have undoubtedly added to the general usefulness of the book. 

This book represents an attempt to write an essentially non-mathematical 
text, for a group of students who would not understand a mathematical one, 
in order to convey to them “all of the basic statistical methods which have 
been used in sociological research, and some of the newer ones which have 
not as yet found wide application in sociological fields.” This same aspiration 
guided the preparation of the first edition 11 or 12 years ago. As then, books 
of this type for this particular audience raise very serious pedagogical prob- 
lems. This book probably succeeds as well as any book could in making some 
progress toward the above goals and so long as there exists a demand for 
texts which make such an attempt it should prove useful. 

The reviewer's private fantasy life on how to teach statistics to novices, in 
some substantive field such as sociology, contains two elements: 

(1) A good supply of first rate papers by the leaders in the substantive 
field of study which contain well thought out applications of statistics to im- 
portant problems. 

(2) A textbook of the general type being reviewed here; i.e., one that is 
essentially an introduction to statistical tools of the trade, but with much 
more emphasis on first principles and the appropriate cautions to be ob- 
served in the use of the tools. In particular this model text would stress the 
logic of model construction and analysis of the structure of the population 
from which samples are assumed to be drawn. In addition it would have di- 
rections for the discovery of cases where sociologists or others ought to yell 
for help from mathematical statisticians, with perhaps some indications as 
to what kinds of help are probably forthcoming. (The importance of this lat- 
ter problem is very much overlooked. In the course of the reviewer's collabo- 
ration with sociologists, when the problem they wished to deal with was com- 
pletely stated, the statistical problems involved were very frequently not 
textbook cases.) 

This model text would be slanted toward the needs of those students with 
serious research ambitions and to that extent is not directly comparable with 
the present work and no unfair comparisons should be indulged in. The de- 
mand and audience for such a model text (if indeed the above model text is 
truly optimal) has not really developed in sociology, nor is it possible to 
satisfy condition (1) above. The authors themselves “regret that the progress 
in the application of statistics to sociological research since 1941 has not 
provided a basis for a more complete substitution of new titles” for older ref- 
erences for supplementary reading. 

Given their audience, the authors have written a book which is, if any- 
thing, a little above the average abilities of those who will read it. As the 
paragraphs immediately above suggest, this is bias in the right direction. The 
book is very sound on those problems it does treat and with close reading 
no one should be given any wrong ideas about the application of statistics to 
sociological problems. 


Statistical Quality Control. Second Edition. Eugene L. Grant. New York: Mc- 
Graw-Hill Book Company, 1952. Pp. xvi, 557. $6.50. 


C. C. Craig, University of Michigan 


As two very good reviews of the first edition of this widely used textbook 
appeared in this Journal, Vol. 42 (1947), pp. 180-184, it is appropriate 
first to note the more important changes in the new edition. 

The most extensive additions are in the area of acceptance sampling. Since 
the appearance of the first edition, the Military Standard 105A acceptance 
sampling tables for inspection by attributes have superseded the Army Serv- 
ice Forces tables. Professor Grant has drawn on the fine study he and Lorber 
made of the new tables to replace the section on the older tables with a very 
good and thorough discussion of the new ones. The clear explanations of the 
use of the tables, of their relation to their predecessors, and of their more 
important characteristics should be welcomed by many present and poten- 
tial users. Another new feature of importance is the introduction of accept- 
ance sampling by variables. The use of known-sigma plans for one- and two- 
sided specifications and of unknown-sigma plans for one-sided specifications 
is explained and relevant tables are provided. The U and Q tests of Schwartz 
and Kaufman are also included as is Shainin's Lot Plot plan, the latter in a 
rather brief and non-committal fashion. The reviewer was disappointed in 
finding no critical discussion of Shainin’s widely used plan. Though group 
sequential plans come in for repeated mention and are explained, the reviewer 
was also disappointed with the very brief treatment given item-by-item se- 
quential plans. An example is given of the calculation of an attributes plan 
with the very minimum of discussion. Surely the importance of sequential 
analysis for applications merits more attention. 

There is a good deal of revision in other sections of the book, too. The 
section on the cost aspects of quality decisions has been somewhat expanded. 
Worthy of particular mention are: (1) the use of the exact value instead of the 
large sample approximation to the standard deviation of the sample stand- 
ard deviation for small samples from a normal universe; (2) the addition of 
a brief discussion of the use of the normal frequency function as an approxi- 
mation to the binomial; (3) a modification in the handling of varying sample 
units on c charts; and (4) a discussion of the application of control charts to 
clerical work. These are all improvements. But of greater importance is the 
fact that the new edition contains 302 problems as compared with 145 in the 
first edition. Perhaps the most difficult task in writing a book of this kind is 
to provide enough good problems; Grant deserves special commendation on 
this point. 

There are also points regarding which the reviewer feels mildly critical. 
First, he does not agree that one loses any significant figures when he sub- 
tracts the square of the mean from the mean of squares in calculating a 
variance. The text appears to give the impression that one sacrifices some- 
thing in return for the convenience of the “short-cut” method. Second, the 
reviewer would have liked to see considerable emphasis on the fact that set- 
ting limits on an X̄ chart by use of an R̄ calculated from R’s whose chart 
shows lack of control is at best only tentative. Third, because it is so hard to 
make the point really stick, he would have been pleased by some real ham- 
mering at the fact that deleting points out of limit lines to set revised limits 
is only paper work unless the related assignable causes are actually removed. 
Fourth, on page 98 the idea of a confidence interval is introduced, though 
not so designated, at first correctly, and then marred by “a less precise but 
more common interpretation” which incorporates the common first miscon- 
ception of this notion. In fact the reviewer believes that no book on statistical 
methods, even at the most elementary level, should fail to give considerable 
attention to this important and clarifying concept. Finally, on page 412 it 
seems that some attempt should have been made to explain the difference 
in origin between the sample standard deviation, as used elsewhere in the 
book, and the square root of the unbiased estimate of the variance and it 
probably should be stated that the latter is also a biased estimate of the 
standard deviation of a normal universe. 
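
The last point can be made concrete. For n independent observations from a normal universe, the expected value of s (the square root of the unbiased estimate of the variance) is c4 times the true standard deviation, with c4 below one. The following is a minimal Python sketch and is not taken from Grant's text; the function name c4 follows common quality-control usage and is an assumption here.

```python
import math


def c4(n: int) -> float:
    """Bias factor for s under normality: E[s] = c4(n) * sigma.

    Here s is the square root of the unbiased variance estimate
    (divisor n - 1). The name c4 is an assumed label, chosen to match
    common control-chart usage.
    """
    if n < 2:
        raise ValueError("need at least two observations")
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


# The bias is appreciable for small samples and fades as n grows.
for n in (2, 5, 10, 25):
    print(n, round(c4(n), 4))
```

Dividing s by c4(n) gives an unbiased estimate of the standard deviation under the normality assumption; for other universes the factor does not apply as written.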

It should be remembered that this is a pre-calculus text and written for the 
student either in school or in a shop who has the very practical aim of mak- 
ing statistical quality control methods work. The reviewer has had a good 
deal of experience with the first edition and he is convinced that the second 
edition only keeps this book considerably the best in its field. As a text for 
engineering students who have had a year of the calculus one could wish that 
there were more relevant mathematical material and that, say, the first 
three chapters on X̄ and R charts had been more concisely written. But the 
engineering point of view in this book is authentic and Grant keeps his feet 
firmly on the ground. This, in combination with its clear and systematic 
exposition, its wealth of examples, and its numerous problems makes its 
merits far outweigh the few imperfections noted. 


Statistical Theory with Engineering Applications. A. Hald. New York: John 
Wiley and Sons, Inc., 1952. Pp. xii, 783. $9.00. 


G. J. Lieberman, Stanford University 


According to the author, the aim of this book is “first and foremost to 
furnish the reader with simple and practical methods—which can be 
understood and applied by non-statisticians—for the handling of the ma- 
jority of the problems which occur in everyday work.” Hald has not only 
accomplished this, but, in addition, has provided the statistician with a ready 
reference book containing an abundance of interesting examples drawn from 
the engineering fields. Furthermore, this book will be useful as a supple- 
mentary text in many of the introductory courses in mathematical statistics, 
inasmuch as the treatment of the topics assumes a minimum knowledge in 
mathematics of the calculus. 

The book devotes chapters to the basic statistical topics including the 
calculus of probabilities, distribution theory, limit theorems, analysis of 
variance, design of experiments, regression, and correlation. Hald’s treat- 
ment of most of these topics is excellent. For example, for an elementary text, 
his axiomatic approach to probability theory is quite refreshing. His han- 
dling of degrees of freedom as related to linear restraints on the random 
variables is much more mature than that found in the usual texts. 

For the practical man, the outstanding achievement of the book is the very 
complete treatment of the usual standard techniques in the handling of data. 
For example, the chapter on Regression contains a section on solving the 
Normal Equations by the Doolittle method. Part of a chapter is devoted to 
criteria for rejection of outlying observations. Almost any topic from a treat- 
ment of the Central Limit Theorem to a section on Punched Cards can be 
found in the book, together with illustrative examples. From a practical 
point of view, Statistical Theory is far superior to anything that has been 
written to date. 

The only criticism of the book is that it is somewhat incomplete from a 
theoretical point of view. Although Hald states that “a general exposition of 
the theory of statistics lies outside the scope of the present publication,” the 
reviewer feels that because the author presents a great deal of theoretical 
as well as applied statistics, certain topics should have been included. There 
is nothing said about either moment generating functions or characteristic 
functions. The sections devoted to statistical inference are very weak. The 
likelihood ratio test is never mentioned. Maximum likelihood is inadequately 
treated, as is the whole Neyman-Pearson theory. Perhaps this criticism is 
due largely to the fact that Hald succeeds in accomplishing more than he 
originally set out to do, thereby suggesting that he could easily have accom- 
plished still more. 

An interesting and most welcome feature found at the conclusion of many 
of the chapters is a section devoted to “Notes and References.” In this sec- 
tion, Hald describes the history of the contents of the chapter, and acknowl- 
edges the authorship of many of the proofs of the theorems presented. For 
example, at the end of the chapter entitled “Fundamental Calculus of Prob- 
abilities,” Hald writes: 


The calculus of probabilities was developed during the seventeenth cen- 
tury in connection with the solution of problems of games of chance. The 
classical definition of probability given in §1.4 took form in a correspondence 
between PASCAL and FERMAT in 1654. On basis of this definition an exten- 
sive mathematical theory was developed. Important contributions were 
made by J. BERNOULLI in Ars Conjectandi, 1713, and P. S. DE LAPLACE, 
who in Théorie analytique des probabilités, 1812, gave a very extensive and 
systematic exposition of the results and methods of the calculus of proba- 
bilities. . . . 

The construction of a calculus of probabilities from a set of axioms, as 
done in §1.3 is analogous to the way in which other branches of applied 
mathematics have been built up. The first systematic exposition of this 
method and its consequences was given by A. KOLMOGOROFF: Foundations 
of the Theory of Probability, Chelsea Publishing Company, New York, 1950, 
originally published in German in 1933. An elementary exposition of the 
modern mathematical concept of probability may be found in P. R. HALMOS: 
The Foundations of Probability, Amer. Math. Monthly, 51, 1944, 493-510. . . . 

An elementary exposition of the different concepts of probability from a 
philosophical point of view can be found in E. NAGEL: Principles of the The- 
ory of Probability, International Encyclopedia of Unified Science, Vol. I, 
No. 6, Chicago, 1939. 


Published in conjunction with this book is a separate volume of statistical 
tables and formulas. This supplement is one of the most extensive books of 
tables published to date. Included in this volume are tables of the normal 
distribution, the t-distribution, the χ²-distribution, the F-distribution, the 
distribution of the range, probits, confidence limits for the parameter of the 
binomial distribution, random numbers, etc. 

Not only is Statistical Theory a worthwhile addition to the family of books 
in statistics, but it is unique among books on statistics in that a quotation 
from George Bernard Shaw is used to illustrate a point. Hald states: 


Thus the concepts of stochastic and causal dependence must be carefully 
differentiated. Bernard Shaw in his characteristic manner illuminates this 
point in the following quotation from the section “Statistical Illusions” of 
the preface to The Doctor's Dilemma: “Thus it is easy to prove that the 
wearing of tall hats and the carrying of umbrellas enlarges the chest, pro- 
longs life, and confers comparative immunity from disease; for the statistics 
shew that the classes which use these articles are bigger, healthier, and live 
longer than the class which never dreams of possessing such things. It does 
not take much perspicacity to see that what really makes this difference is 
not the tall hat and the umbrella, but the wealth and nourishment of which 
they are evidence, and that a gold watch or membership of a club in Pall 
Mall might be proved in the same way to have the like sovereign virtues.” 


Modern Elementary Statistics. John E. Freund. New York: Prentice-Hall, Inc., 
1952. Pp. x, 418. $5.50. 


Norman Rudy, Sacramento State College 


Attempts to shift the emphasis in introductory statistics courses from de- 
scriptive to inductive statistics have suffered from the absence of suita- 
ble textbooks and from the poor preparation in mathematics of most college 
students. The book reviewed here frankly emphasizes inductive statistics 
and, according to the preface and the publisher’s advertisements, requires 
only a minimum amount of mathematics. On this basis alone, the book de- 
serves the close attention of those concerned with the introductory statistics 
course. 


The author states that “The order and the emphasis of the material cov- 
ered follows the modern trend in the teaching of statistics—to include in- 
formally topics that in the past have often been taught only on an advanced 
level.” At other points, he emphasizes that the meanings of statistical ideas 
are of greater importance than the formulas. 

Despite certain criticisms to be made later, the reviewer’s over-all impres- 
sion is that the book will perform a valuable service in a general introductory 
course. The concept of a sampling distribution is very well developed and is 
emphasized whenever a statistic is introduced. Undoubtedly, the student 
who reads this book carefully will absorb what many believe to be the single 
most important idea in a beginning course, namely that of variability in the 
possible outcomes of a sample. The discussion of confidence intervals and 
their interpretation is also very good, and is strengthened by the construc- 
tion of intervals for a non-normal distribution. The informal manner of 
presentation employed and the general tone of the book are consistent with 
what appears to be a current trend toward the humanizing of statistical 
ideas, comparable to the current humanism in the natural sciences. 

The adoption of an informal manner of presentation, which is virtually a 
necessity in the introductory course, places upon the author the responsibil- 
ity of providing enough exposition to insure that the meanings are clear. One 
of the weaknesses of the book lies in the fact that too often the meaning is 
subordinate to the formula, too often the elements of the formula are not 
justified heuristically or intuitively. A few examples will illustrate this criti- 
cism. 

In the section entitled “Some Rules of Probability,” the notions of con- 
ditional and joint probability are presented without benefit of any graphical 
or tabular aids, such as the two-way frequency table, found in the books by 
Wilks and Duncan. Again, in the discussion of confidence intervals for pop- 
ulation means, using small samples, the factor √(n−1) appears without any 
mention of degrees of freedom. In fact, the concept of degrees of freedom is 
not mentioned at any of the three places at which reference is made to the 
t distribution. In yet another instance, the χ² distribution is discussed in 
three different applications: first, as employed in the construction of confi- 
dence intervals for the population standard deviation; second, as a test for 
association in contingency tables; and, third, as a test of goodness-of-fit. In 
the first application, the notion of degrees of freedom is not mentioned, while 
in the other two it is. There is no connection made between the two formulas 
given for χ², and no heuristic justification given either for squaring the dif- 
ference between observed and expected frequencies or for dividing the square 
by the expected frequency. 
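
One heuristic of the kind asked for here: a cell count with expectation E has variance roughly E, so each term (O − E)²/E is approximately a squared standardized deviation, and the sum behaves like a sum of squared standard normal variables. The Python sketch below is only an illustration of the goodness-of-fit statistic; the function name and the die-throwing counts are invented for the example and do not come from Freund's book.

```python
def pearson_chi_square(observed, expected):
    """Sum of (O - E)^2 / E over the cells.

    Squaring keeps positive and negative discrepancies from cancelling;
    dividing by E puts each cell on a common scale, since a count with
    expectation E has variance of about E.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 throws of a die, 10 expected per face.
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
statistic = pearson_chi_square(observed, expected)
degrees_of_freedom = len(observed) - 1  # one linear restraint: the totals agree
print(statistic, degrees_of_freedom)
```

The statistic is referred to the χ² distribution with the stated degrees of freedom, which is where the concept the reviewer finds missing enters.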

The other general criticism which can be made of the book concerns the 
omission of some ideas of modern statistics which are important and basic 
enough to be considered in an introductory course. In the discussion of hy- 
pothesis testing the reader is advised to avoid the type II error (accepting 
the null hypothesis when it is false) by never accepting the null hypothesis 
at all. By this approach, the author precludes any possible discussion of the 
theory of statistical decision, certainly as important a topic in the intro- 
ductory course as the use of runs in testing for randomness. (The latter gets 
eleven pages of text and two pages of tables.) The omission is especially dis- 
turbing because in each of the two examples used, the decision to reserve 
judgment on the null hypothesis is also a decision whose consequences should 
be considered. With the type II error eliminated from consideration, the 
selection of the level of significance (only two-sided tests are considered) is 
made by following “the customary rule” of .05. This advice is not at all dif- 
ferent in spirit from the 3σ rule found in older textbooks. 
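
To show what is lost when the type II error is set aside, its probability can be computed directly in the simplest case. The Python sketch below assumes a two-sided test of a normal mean with known standard deviation; these assumptions, the function name, and the numerical inputs are illustrative only and are not drawn from the book under review.

```python
from statistics import NormalDist


def type_ii_error(delta, sigma, n, alpha=0.05):
    """Probability of retaining the null hypothesis when the true mean
    differs from the hypothesized mean by delta.

    Assumes a two-sided z test on n observations with known sigma.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta * n ** 0.5 / sigma  # true difference in standard-error units
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Hypothetical inputs: the same .05 level gives very different protection
# against a false null hypothesis depending on sample size.
for n in (4, 16, 64):
    print(n, round(type_ii_error(delta=0.5, sigma=1.0, n=n), 3))
```

Fixing the level of significance by custom says nothing about these probabilities, which is why the choice of level cannot sensibly be separated from the consequences of the type II error.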

Another major omission, in the reviewer's judgment, is the complete ab- 
sence from the sections on correlation and regression of the concept of a 
“model” or of any indication of the importance of a priori considerations in 
scientific investigations. According to the text, the treatment accorded the 
data (linear or curvilinear fit) is determined by inspection of the data itself, 
and since curvilinear fitting is taken as being outside the scope of the book, 
there is no need to discuss criteria for choosing between linear and curvilinear 
regressions. In the discussion of the standard error of estimate (in which 
no mention is made of degrees of freedom), it is incidentally revealed that X 
and Y are assumed to be normally distributed [a bivariate normal distribu- 
tion?]. As usual, no mention is made of the fact that the prediction of Y on 
the basis of a given X is actually that of the average Y corresponding to the
given value of X. 

A final comment, lest these criticisms lead the reader to overlook or dis- 
count the comments made in the third paragraph: The reviewer has adopted 
the book for a general introductory course.


Elementary Statistics. Revised. Morris Myers Blair (Professor of Economics, 
University of Tulsa). New York: Henry Holt & Co., 1952. Pp. xiv, 735, $5.50. 
R. Clay Sprowls, University of California (Los Angeles)


Upon reading the first chapter of this book, my pulse quickened in
anticipation of a “new” book in the field of elementary statistics. The
author states: “Because life is so short that we cannot know all about
anything, we have to make our decisions on the basis of sample information.
This is the statistical method;” and, “Statistics as a science in its broader 
meanings is not only a means of guiding our judgment in making the daily
decisions of life, but also a device or tool for discovering new truths. It is one
of the most powerful engines for research.” From here to the end of the book
my pulse was normal except for one small flutter on page 208 where the 
author writes: “But analysis is usually not an end in itself. One analyzes in
order that he may comprehend and forecast, infer, project, and apply his
information for further practical or theoretical use. The great end of
statistical description and analysis is inference.” If the author really believes
this, why did he not orient his textbook in this direction? 

Is it because, as he says, “He is convinced that statistics should first be
taught to most students from the standpoint of preparing them to be con- 
sumers of statistics instead of creators of research?" What better way to 
teach students to be intelligent consumers (even of descriptive statistics) 
than to make them create one small piece of research in which they them- 
selves analyze a problem, specify the data needed, collect that data, make 
the necessary estimates, compute their errors, and then draw an inference? 

Or is it because he thinks of a very selective area of consumption? “Many 
farmers and small merchants read carefully the price quotations, deliveries 
of grains and livestock . .. tables and charts... in the daily newspapers. 
These people are consuming statistics.” I infer from this that he wishes to
teach people how to interpret statistical data—an admirable wish. These 
people and others are also continually faced with making decisions which 
are based upon incomplete information and, therefore, subject to uncertain 
outcome. Implicitly or explicitly, they deal in probabilities. What better
place than in elementary statistics to contribute to the general education by 
introducing the statistical method with its emphasis upon choice among al- 
ternatives and the probability of being wrong in that choice? 

Granting that Blair is correct in appraising the consumer area, I believe 
that there are serious omissions in the text. I shall cite a few examples. In his
discussion of the range (p. 152) he does not mention the interpretative
fallacies inherent in the dependence of the magnitude of the range upon the
sample size. In the discussion of regression analysis there is no cautioning
about the value of the intercept often being an extrapolated value subject to 
large error and often an impossible value in the particular problem. Spurious 
correlation, a serious pitfall for the consumer, is not mentioned; neither is the 
frequently recurring trap of the regression fallacy. On p. 236 it is stated that 
rank correlation methods are not widely used. This Journal recently featured
an article on the use of ranks which listed sixty-nine works in its bibliogra- 
phy; fifteen of these had the word rank in their title and others, I am sure, 
also discussed ranking methods. In the discussion of stratified sampling (p. 
370), there is nothing to indicate that the real gains from stratification occur 
when the conditions are such that individual strata are sampled
disproportionately to their size in the population. In the chapter on seasonal variation
(chapter 16) he gives three methods of calculating seasonal variation, but
nowhere do I find (a) an explanation of a seasonal index of, say, 120 for a
particular month; (b) any mention of adjusting data for seasonal variation;
(c) an interpretation of a seasonally adjusted value. 
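
To illustrate items (a) through (c), here is a minimal sketch in Python. It assumes the common ratio convention, in which a seasonal index is stated on a base of 100 for the average month; the function name and the sales figure are hypothetical and are not taken from Blair's text.

```python
def seasonally_adjust(actual, seasonal_index):
    """Remove the seasonal component from one observation.

    Assumes a ratio-type seasonal index on a base of 100, so that an
    index of 120 means the month typically runs 20 per cent above the
    average month of the year.
    """
    return actual * 100.0 / seasonal_index


# Hypothetical figure: sales of 600 in a month whose seasonal index is 120.
adjusted = seasonally_adjust(600, 120)
print(adjusted)  # 500.0
```

Under these assumptions the adjusted value of 500 is what the month would have shown had it been an average month, which is the kind of interpretation the text is said to omit.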

It seems to me that Blair has not only missed a consumer area to which 
statistics has much to contribute, but that he has also failed to write a book 
which is successfully consistent with his own original intent, both because of 
these omissions which I have mentioned and because he continually drags
the student through a mass of computations (a function of the producer?) to
the neglect of interpretation. In my judgment, this is just another textbook
in elementary statistics. It may be preferred to another because the first
edition (1944) has been used in the past, because there are elaborate work- 
sheets which aid students (a valuable service), or because it may be a con- 
venient handbook. In other respects it is not distinguishable from a dozen 
or so other elementary textbooks in statistics. 


Statistical Methods for Social Workers. Wayne McMillen. Chicago: The Uni- 
versity of Chicago Press, 1952. Pp. xi, 564. $6.75. 


Daniel O. Price, University of North Carolina


The stated aim of this book is to provide an orderly introduction to
descriptive statistics for social workers (vii). In the opinion of this reviewer
it misses the mark considerably. It is uneven in its treatment of material and 
contains several serious statistical flaws. It deals almost entirely with the
field of descriptive statistics and contains the statement, “Perhaps in the
long run inductive statistics will make contributions to human knowledge 
that clearly entitle it to be regarded as of far greater value to society than 
descriptive statistics” (vii, reviewer’s italics). The first five chapters deal 
with the collection, editing, tabulation, and presentation of data. 

Although the author seems dubious of the value of inductive statistics to
social workers, the two illustrations on page 161 are both examples of
inductive statistics.

In Chapter IX on the Mean, Median, and Mode it is not made clear that 
the “long” and “short” methods of computing the Mean produce identical 
results with the exception of rounding errors. In fact one gets the impression 
that the “short” method is an approximation to the results of the “long” 
method with the statement, “Thus the two methods produced substantially 
the same result” (p. 234). 
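
The identity of the two methods can be checked with a short sketch. The grouped data below are hypothetical, and the "short" method is taken here to be the usual assumed-mean, step-deviation computation, which is an assumption about what McMillen presents. Exact fractions are used so that rounding cannot obscure the comparison.

```python
from fractions import Fraction


def mean_long(midpoints, freqs):
    """'Long' method: sum of f*x divided by the sum of f."""
    n = sum(freqs)
    return Fraction(sum(f * x for x, f in zip(midpoints, freqs)), n)


def mean_short(midpoints, freqs, assumed_mean, width):
    """'Short' method: assumed mean plus width times the mean step deviation."""
    n = sum(freqs)
    steps = [Fraction(x - assumed_mean, width) for x in midpoints]
    mean_step = Fraction(sum(f * d for d, f in zip(steps, freqs)), n)
    return assumed_mean + width * mean_step


# Hypothetical frequency distribution (class midpoints and frequencies).
midpoints = [5, 15, 25, 35, 45]
freqs = [3, 8, 12, 5, 2]

print(mean_long(midpoints, freqs))           # 70/3
print(mean_short(midpoints, freqs, 25, 10))  # 70/3
```

The two results agree exactly, not merely "substantially"; any difference seen in hand computation comes from rounding intermediate figures.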

Chapter X is a 17 page chapter on the Geometric Mean and Logarithms.
Considering the doubtful value of the geometric mean to social workers this 
space might well have been utilized further in the book in the treatment of 
statistics of relationship.

In the chapter on Measures of Absolute Variability the author confuses 
the binomial and normal distributions, actually using a binomial distribution 
as his illustration on page 270 and calling it a normal curve. The binomial
distribution is not mentioned. In this same chapter eight pages are devoted
to the mean deviation, a measure of limited usefulness, and only six pages
are spent on the standard deviation, one of the most basic statistical meas- 
ures. It is also in this same chapter that reference is made to Appendix P, a 
four page appendix on how to extract a square root. At no point is a reference
made to Barlow's Tables of Squares, Cubes, Square Roots, Cube Roots, and 
Reciprocals. 

Chapter XIII deals with Ratio Background or semilogarithmic paper. This
material really belongs in the early chapter on graphic presentation.

Although the book deals primarily with descriptive statistics, Chapter XV 
is an 11 page chapter on Sampling which contains several errors sufficiently 
serious to deserve comment. No distinction is made between a systematic 
sample and a random sample, the formula for the estimated standard error
of the mean does not include the loss of a degree of freedom for use of the
sample standard deviation instead of the universe standard deviation, the
t distribution is not even mentioned, and the probable error is used
incorrectly. On page 340 we find the statement, “The mean of the sample plus
and minus the probable error defines the range within which half the sample 
means may be expected to fall.” Erroneous interpretations such as this are 
part of the reason for dropping the probable error from modern statistical 
usage. Confidence limits are not mentioned.
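
The degree-of-freedom point can be made concrete with a small sketch using Python's standard `statistics` module. The sample values are hypothetical, and the function names are illustrative only.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Estimated standard error of the mean, using the n - 1 divisor."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def standard_error_without_correction(sample):
    """Same quantity with the n divisor, ignoring the lost degree of freedom."""
    return statistics.pstdev(sample) / math.sqrt(len(sample))


sample = [4, 7, 6, 5, 8]  # hypothetical observations

corrected = standard_error_of_mean(sample)
uncorrected = standard_error_without_correction(sample)
# The uncorrected figure is smaller by the factor sqrt((n - 1) / n).
print(corrected, uncorrected)
```

For a sample of five the uncorrected figure understates the standard error by roughly a tenth; the discrepancy shrinks as the sample grows, which is why it matters most in the small samples social workers are likely to meet.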

In Chapter XVI on Time Series a "derivation" of the two normal
equations for getting the least squares straight line is given. The only difficulty is
that what is given is not a derivation since the criterion of minimizing the
sums of squares of deviations is not utilized and this characteristic of the
least squares straight line is not mentioned until two pages later. 
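
For reference, a derivation that does use the least squares criterion can be sketched as follows. The notation is an assumption (Y for the series, X for time, a and b for the intercept and slope, n for the number of observations) and is not necessarily McMillen's.

```latex
\begin{align*}
S(a, b) &= \sum_{i=1}^{n} \left( Y_i - a - b X_i \right)^2 \\
\frac{\partial S}{\partial a} &= -2 \sum_{i=1}^{n} \left( Y_i - a - b X_i \right) = 0
  \quad\Longrightarrow\quad \sum Y_i = n a + b \sum X_i \\
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{n} X_i \left( Y_i - a - b X_i \right) = 0
  \quad\Longrightarrow\quad \sum X_i Y_i = a \sum X_i + b \sum X_i^2
\end{align*}
```

The two equations on the right are the normal equations; they follow only because the sum of squared deviations S is being minimized.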

Chapter XVII on Correlation and Contingency leaves so much to be de- 
sired that by itself it would make the book of very limited value even if there 
were no shortcomings elsewhere in it. The only method given for computing
a correlation coefficient is by first determining the regression line, then
computing Yc corresponding to each Y, finding the difference between Y and Yc,
and squaring these differences; then finding the difference between each Y
and the mean of Y and squaring these differences, then from these two sets
of squares of differences determining the standard error of estimate and the
standard deviation, and utilizing these two measures to find the correlation.
The procedure in itself would discourage most students from ever computing
a correlation coefficient. No real understanding of the correlation coefficient
is shown and no test of statistical significance of a correlation is given. In fact
the author says, “Unfortunately, this cannot be done” (p. 386). His confusion
seems to lie in not distinguishing between statistical significance and practical
importance.
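
A minimal sketch of the contrast, on hypothetical data: the first function follows the roundabout route described above (regression line, computed values, standard error of estimate, standard deviation), while the second uses the familiar product-moment formula directly. Both divisors are taken as n, which is an assumption about the book's formulas; function names are illustrative.

```python
import math


def fit_line(xs, ys):
    """Least squares intercept and slope of Y on X."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


def r_roundabout(xs, ys):
    """Correlation via the standard error of estimate and standard deviation."""
    a, b = fit_line(xs, ys)
    n = len(ys)
    my = sum(ys) / n
    se_est = math.sqrt(sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / n)
    sd_y = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    r = math.sqrt(max(0.0, 1.0 - (se_est / sd_y) ** 2))
    return math.copysign(r, b)  # the sign is taken from the slope


def r_direct(xs, ys):
    """Product-moment correlation from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


xs = [1, 2, 3, 4, 5]  # hypothetical paired observations
ys = [2, 4, 5, 4, 5]
print(r_roundabout(xs, ys), r_direct(xs, ys))
```

The two routes give the same coefficient; the direct formula simply needs five sums, which is far less arithmetic by hand.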

The author gives two illustrations of high correlation between time series 
and suggests that the high relationship observed is due merely to chance, 
overlooking the fact that the correlation is due to the close relationship of
each variable to a third variable, time. 
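
The role of time as the third variable can be illustrated with a sketch on constructed data. Two hypothetical series are built as a trend plus unrelated oscillations; their raw correlation is high, but it nearly vanishes once the linear trend in time is removed from each. All figures and names are illustrative assumptions.

```python
import math


def pearson(xs, ys):
    """Product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def detrend(ts, ys):
    """Residuals of ys after removing a least squares straight line in time."""
    n = len(ts)
    mt, my = sum(ts) / n, sum(ys) / n
    slope = sum((t - mt) * (y - my) for t, y in zip(ts, ys)) / sum(
        (t - mt) ** 2 for t in ts)
    return [y - (my + slope * (t - mt)) for t, y in zip(ts, ys)]


t = list(range(12))
wiggle_a = [1, -1] * 6          # oscillation with period 2
wiggle_b = [1, 1, -1, -1] * 3   # unrelated oscillation with period 4
series_a = [2 * ti + w for ti, w in zip(t, wiggle_a)]
series_b = [3 * ti + w for ti, w in zip(t, wiggle_b)]

raw_r = pearson(series_a, series_b)
residual_r = pearson(detrend(t, series_a), detrend(t, series_b))
print(raw_r, residual_r)
```

In this construction the raw coefficient exceeds 0.95 while the detrended one is close to zero, so the association is produced by the common trend rather than by chance.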

As a suggestion for simplifying the computation of a correlation coefficient,
the author suggests using the Spearman rank correlation as an estimate of 
the actual correlation between two variables and shows how this is com- 
puted. He gives no indication of the differences in assumptions and in utility 
of the two types of correlation coefficients. 

As a whole the book has a point of view of statistics which was fairly
general about 30 years ago. A tangible illustration of this is the fact that the
book at no point assumes that the student has access to a calculating
machine or even mentions a calculating machine. This is definitely not a modern
statistics text and it is to be hoped that it will not set the tenor of statistical 
training among social workers. 


Punched Cards: Their Applications to Science and Industry. Robert S. Casey and 
James W. Perry, editors. New York: Reinhold Publishing Corporation, 1951. 
Pp. viii, 506. $10.00. 


Harry P. Hartkemeier, University of Missouri


This book reveals the fact that chemists are now facing the same problem
that confronted statisticians about 70 years ago. Statisticians trying to
handle by hand the large volume of data collected in the 1880 U. S. Census
took over seven years to organize and make available the information
obtained. Chemists working now on the Gmelin Handbuch der anorganischen
Chemie think that it will take 10 to 12 years to complete the work and they
hope to complete the 8th edition “about 1960.” This work is taking so long
that many chemists fear, and others are convinced, that the era of the classi- 
cal handbook is approaching its end. The parallel is striking. Statisticians in 
the ’80’s stated that the U. S. Census of 1890 would take more than 10 years 
to organize by hand and another census would have to be started before the 
information from the previous one could be made available, so some more 
rapid method of handling the data was imperative. This information was 
presented to some people who set about to invent and develop the Hollerith 
machines. Now, over half a century later, chemists are turning to the same 
punched-card machines to solve a similar problem—that of making available 
the pertinent information quickly, before it is made useless by the passage 
of time. 

An alternative to periodic handbooks is the mechanical information center. 
Various parts of this book point out that similar problems have arisen in
other fields, such as searching U. S. patent disclosures, hospital records,
medical reports, library record cards, etc. Scientists have been forced to
consider mechanical methods of searching indexes, and this book contains an
excellent and clear discussion of the difficulties of using standard punched-card
machines for searching such indexes. Although codes may be devised to
classify books by size, date of publication, number of pages, etc., so that
when a book is assigned to one class it is excluded from all others, a code for
subjects presents a different problem, for the assignment of a book or article
to one subject class does not exclude it from all other subject classes. To
classify a book under two or more subject classes requires the use of two or 
more vertical card fields. This also usually requires that the indexer decide 
which is the major subject field and which are minor subject fields. A person 
searching for all books which deal with a given subject might want to examine 
a book even though it is indexed under a minor field. As the number of fields 
increases, the standard machine cannot search all of them at once, or on one 
run of the cards, for all books dealing with a certain subject, or search for all 
books dealing with a combination of subjects. 

A person, for example, who wants to locate all books dealing with the use 
of statistical methods to control the quality of a chemical compound used in 
a recently patented process to manufacture a new plastic would like to be 
able to locate all books dealing with this combination of subjects regardless 
of whether the book is primarily one on statistical analysis using this illustra- 
tion of quality control methods, or primarily a book on plastics which in- 
cludes a discussion of the fact that this particular plastic would not have been 
possible without uniform quality of a chemical compound obtained by using 
statistical methods of quality control, or even a book on patent laws that 
happened to contain an illustration of a court case arising out of the use of
this plastic and involving the presentation of evidence including quality 
control charts. It would be nice if the machine would sort separately cards for 
books involving all of the subjects desired, all but one, all but two, etc. This
book points out the need for machines that will accomplish such mechanical 
searches on one run of the cards and probably stimulated the research that 
resulted in the perfection of the machine described in Library Applications of 
Punched Cards by Ralph Parker (American Library Association, Chicago, 
1952). The use of horizontal fields instead of vertical fields permits the ma- 
chine to search 12 fields in succession on one run of the cards. Each of the 
12 subject fields will accommodate a binary code number equivalent to 10 
decimal places in the Dewey decimal system. 
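
The search described above (cards for books involving all of the wanted subjects, all but one, all but two, and so on, in a single pass) can be sketched in software terms. This is only an analogy to the mechanical sort, not a description of any machine; the card contents, subject names, and function name are hypothetical.

```python
def sort_by_missing(cards, wanted):
    """One pass over the cards, piling each by how many wanted subjects it lacks.

    Cards that match none of the wanted subjects are left out entirely.
    """
    wanted_set = set(wanted)
    piles = {}
    for card_id, subjects in cards.items():
        missing = len(wanted_set - set(subjects))
        if missing < len(wanted_set):
            piles.setdefault(missing, []).append(card_id)
    return piles


# Hypothetical index cards: each book may carry several subject codes.
cards = {
    "statistics text": {"statistics", "quality control", "plastics"},
    "plastics text": {"plastics", "quality control", "chemistry",
                      "statistics", "patents"},
    "patent law": {"patents", "plastics", "quality control"},
    "cookbook": {"cooking"},
}
wanted = ["statistics", "quality control", "plastics", "patents"]

piles = sort_by_missing(cards, wanted)
print(piles[0])  # cards matching every wanted subject
print(piles[1])  # cards matching all but one
```

Because every subject on a card is examined regardless of which one the indexer considered the major field, a book indexed under a minor field is still found.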

This book contains the case histories of many punched-card applications 
and the editors are to be congratulated upon the completion of a considerable
project involving many people in widely scattered locations. “The hand-sorted
edge-punched cards are discussed in greater detail than the machine-sorted
cards. In fact, one object has been to make the book serve as an
operating instruction manual for the edge-punched cards. It is not possible to do
the same for machine-sorted cards within the scope of this book” (p. iii).
The reviewer grants that this is the first and only book presenting detailed 
information on all hand-sorted edge-punched card systems, but he is also 
convinced that anyone adopting a hand-sorted system will find it to be only a
temporary solution. Even for such prospective users the book may be dis- 
appointing because “there is no attempt to make critical comparisons" (p. 
39). In a book written by so many people in different locations it is very diffi- 
cult to avoid some duplication and a few statements that some may question. 
For example, “The only known correct way to sample is by the use of random
numbers” (p. 412), and “The mathematics of ‘goodness of fit’ have not yet
been adequately investigated” (p. 413).

This book is not very suitable as a college textbook but it should be in
most libraries to provide supplementary assignments. Many scientists,
information specialists, executives, and office managers will find it useful as a
reference.




Population, Food, and Economic Progress. Merrill K. Bennett. Houston, Texas:
The Rice Institute, July 1952. Pp. 68. Paper. 


Colin Clark, The Bureau of Industry, Brisbane, Australia


This pamphlet is a triumph of common sense. Such a world do we live in,
that we are now beginning to regard common sense among learned men
as one of the rarer virtues. A very well known professor of theoretical
economics was recently heard to grumble that men of common sense never
looked at statistics, and that men who studied statistics always seemed to
leave their common sense behind them.

Dr. Bennett’s work should now be well known, not only for his studies at 
the Food Research Institute of Stanford University, but also for his work in 
the field of international comparisons of levels of living, or “real income.”
(American Economic Review, 1951). We have not, and will not have for many 
years to come, accurate information about income levels in the
under-developed countries, and the United Nations statistics of average income per
head in these countries are not (speaking with all restraint) what they appear 
to be. Dr. Bennett devises a most ingenious statistical technique for
combining all the scattered scraps of information available for most of these
countries, such as the number of automobiles, telephones, high school students,
and so on. The present reviewer had set out to measure the incomes of some 
of these countries quite differently, and the check with the Bennett method 
proved to be extremely satisfactory; so he is attempting to extend its sphere 
of application to more countries and to more recent years. 

One of Dr. Bennett’s principal figures is taken from FAO publications
(which perhaps need no more qualification than they have received), namely 
calories of food consumption per head of population. He begins by reminding 
us of an obvious fact, the neglect of which has caused endless trouble— 
namely, that if statistics show that Europeans and North Americans con- 
sume 3,000 calories per head per day, and the inhabitants of southeast Asia 
2,000, the latter figure, per kilogram of average body weight, may be greater 
than the former. 

The next point which he makes is that the greater part of the world did 
and does obtain its calories from grains and roots, but that we all like to
consume a substantial proportion of animal foodstuffs when we can afford them.
In the past, this was the privilege of the wealthy on the one hand, of nomadic
huntsmen and herders on the other; for obvious reasons, the latter category
now represents a far smaller proportion of the earth's population than it did
in the past. 

The next obvious fact which we tend to forget, and to which Dr. Bennett 
forcefully draws our attention, is that throughout the greater part of human 
history mankind must have had an inappreciable, if any, rate of population 
increase. In 1650, when modern population estimates begin, the world had a 
population of about five hundred millions. Biologists have contended that 
the human race has been in existence for five hundred thousand years, 
though some of them have recently raised the estimate to a million years— 
even if we assume that they have a couple of zeros out of place, we are still 
left with the same problem. Whether we assume that the human race started
with two people, as Christians believe, or take any larger number predicated 
for us by biologists, and whatever conceivable date we take as the starting 
point of the human race, we are still left with the conclusion that any rate of 
population increase even remotely comparable with those which we know 
now would have produced a far larger world population by 1650. This accords 
with the observations of anthropologists that primitive people produce large 
families, but that there is little or no net rate of population increase among
them. Dr. Bennett has employed a team of historical students who follow 
world population data back to the year 1,000 A.D., and find plenty of evi- 
dence of stagnant or indeed declining populations, in all those historical 
periods when political order broke down. There is some evidence of heavy 
population declines, both in Europe and Asia, during the centuries of warfare 
and disorder between 300 and 800 A.D.
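
The arithmetic behind this argument is easy to reproduce. The sketch below takes the figures given in the text (two people at the start, about five hundred millions in 1650) and computes the constant annual rate of increase implied by several assumed spans; the 5,000-year case stands for the reviewer's "couple of zeros out of place." The function name is illustrative.

```python
import math


def implied_annual_growth(start, end, years):
    """Constant annual growth rate that turns `start` into `end` over `years`."""
    return math.exp(math.log(end / start) / years) - 1.0


POPULATION_1650 = 500_000_000  # "about five hundred millions", from the text

for years in (1_000_000, 500_000, 5_000):
    rate = implied_annual_growth(2, POPULATION_1650, years)
    print(f"{years:>9,} years: {100 * rate:.4f} per cent per year")
```

Over five hundred thousand years the implied rate is roughly 0.004 per cent per year, and even over five thousand years it is only about 0.4 per cent, which bears out the conclusion that sustained increase at anything like modern rates cannot have been the historical norm.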

The reviewer would, however, disagree with Dr. Bennett’s results on
India. There, it appears, population rose to a maximum about the middle of
the seventeenth century, under the despotic but ordered rule of the Mogul 
emperors; and in the ensuing two centuries of anarchy and warfare remained 
stationary or declined until order was again restored under British rule in the
nineteenth century. 

The writings of Malthus, which have enjoyed such a comeback among 
American intellectuals in the present generation, are thus found to be a
theoretical speculation not valid historically—nor, for that matter, did Mal- 
thus make a valid estimate of the possibilities of agricultural development. 
Human capacity to reproduce is not limited. Demographic studies of primi- 
tive peoples, showing an average of six or seven children born to a woman 
who lives to the end of her reproductive period, agree well with the estimates
of the medical sub-commission of the recent British royal commission on
Population, as to what would be the total average fertility of the modern
English woman, if she married early and imposed no restrictions on
reproduction; and also agree with recorded total fertility in some modern
communities like Brazil, and sample areas in China. But under primitive
conditions, or under the conditions of warfare and anarchy with which so much
of the history of civilization is disfigured, this rate of reproduction will hardly
allow the human race to maintain its numbers. Population increases only
occur where mankind can establish a stable political order—a surprisingly
difficult thing to do—and at the same time, accumulate sufficient medical 
knowledge to control the ravages of disease—a combination only found in 
the modern world, except perhaps for some periods under the ancient Greeks 
and Egyptians. 

And when such order and increase have been established, it is surprising 
how short is the time interval—as both modern experience and ancient his- 
tory teach us—before mankind, of its own accord, starts in various ways to 
reduce its rate of reproduction. We know our own nineteenth century his- 
tory; in Japan and Russia we can trace, with reasonable accuracy, the fall in 
the average total fertility from six to four over the short period of the last 
thirty or forty years; and very few people are aware that a similar trend is 
occurring in India, where the rate of population increase is rapidly decelerat- 
ing. 

In his final and magnificently common sense conclusion, therefore, Dr. 
Bennett sees “a touch of hysteria” in “current attitudes towards what is 
called the population problem,” or the fashionable idea that America ought 
to send what might be called missionaries of contraception to inform the be- 
nighted Asiatics. Even in the modern world, areas of excessive population 
density are rare, and most of the habitable world is still uninhabited. War 
and anarchy in the past have not arisen in over-populated regions—rather 
the reverse, among the people of sparsely settled grazing areas, who wished 
to live on the labor of others, rather than cultivate land which was available 
for settlement. Calculations of the remote future at which the world, at its 
present rate of increase, will become finally overcrowded, are of little interest 
—a purely arithmetic exercise, and about as sterile as the favorite old compu- 
tation of what would happen if a man left one thousand dollars to be in- 
vested at compound interest for the benefit of his descendant in six hundred 
years’ time.


Census of Manufactures: 1947: Indexes of Production. Bureau of Census and 
Board of Governors of the Federal Reserve System. U. S. Government Printing
Office, 1952. Pp. viii, 99. $1.75. 


Paul B. Simpson, University of Oregon


The volume explains the material, methods, and results of computation
of an index of manufacturing based on the 1947 and 1939 Census of
Manufactures. It extends the work of Solomon Fabricant: The Output of 
Manufacturing Industries, 1899-1937 (National Bureau of Economic
Research, New York, 1940), both in years covered and methods employed. The
bulk of the work comprises detailed tables of quantity and value of output, 
industry indexes of output, employment, and weights. The text explains 
methods used, and the reasons therefor. 

The final index for 1947 relative to 1939 is 174, indicating that this eight
year period witnessed a record percentage rate of manufacturing growth in
the United States. The percentage increase from 1921 to 1929 was somewhat 
more, namely 88 as against 74, but this was accomplished during passage
from a depressed year, 1921, to a peak year, 1929, whereas 1939 was a near
record year of industrial production. Jim Corbett fans may be interested to 
know that in the ten-year period 1899 to 1909, manufacturing increased 56 
per cent. The growth since pre-war years has been large by nearly any stand- 
ard. 

The greatest task facing any index computor is obtaining data comparable 
over time. The Census and Federal Reserve Board collaborators have done a
remarkable job in obtaining as much such data as they have. The work
involved in such an undertaking may be illustrated with a random quotation:


“Census figures for “other natural cheese” in 1947 represent “shipments and
interplant transfers.” These figures are not comparable with 1939
“production” statistics, since they exclude natural cheese made into process cheese.
. . . Hence a United States Department of Agriculture quantity figure,
which includes this production was used for 1947” (p. 77). 


It requires a special talent of patience, imagination, and devotion to ac- 
curacy to ferret out such detailed information. The statistical world and the 
nation owe a debt to those who have repeated the task a thousand times
over in preparing this index and other economic series. The quotation also 
serves to remind us that pretentious economic categories such as supply, 
marginal efficiency of capital, and gross national product are compoundings 
of matters no more mysterious than cheese.

The Census Bureau could confine its work to collecting information and 
turning it over to the public for analysis. The present volume is ample testi- 
mony that further stages of analysis are desirable. The comparison of 1947 
and 1939 data was accomplished better on home grounds than it could have 
been elsewhere. Indeed, this reviewer believes that there would be a net gain
if the scope of the index construction activity had been expanded still further. 
Measurement cannot be separated from economic analysis, and the closest 
integration is desirable. Below are some examples of points where the
computation of the index raises problems of general economic analysis.

The index formula used is a cross-weight formula, the Marshall-Edgeworth 
formula using 1939 and 1947 value-added-per-unit weights. With simple 1939 
weights, the index would have been 184, and with 1947 weights, the index
would have been 169. This difference indicates that quantity of production
increased most for those items whose prices increased least. In supply and
demand terms, this says that for those items whose demand schedules in- 
creased the most, supply schedules increased proportionately more. If we 
look not at the total manufacturing index but at industry groups, a related 
tendency is manifest. The difference between group indexes based on the 
two weights is greatest where production increased the most, the 1947
weights yielding a lower index. Thus leather products, whose physical
production increased only 15 per cent from 1939 to 1947, had commensurate
price increases, with the result that the two weight systems yielded the same
index. On the other hand non-electrical machinery increased in production
177 per cent from 1939 to 1947 as measured by the 1939 weight index, and by
160 per cent as measured by the 1947 weight index. The Spearman coefficient 
of rank correlation of 1939 indexes and the difference of 1947 and 1939 weight 
indexes for nineteen groups is .74. Assume that for those groups where growth 
was greatest, the dispersion of the amounts of growths of individual indus- 
tries is largest. This assumption, similar to one frequently made in stratified 
sampling, is reasonable here. Using this assumption, we reach the conclusion 
that within most groups of industries, as in the total manufacturing group, 
supply reacted proportionately more strongly to larger increases in demand.
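
The relation among the three weighting systems can be seen in a two-commodity sketch. The quantities and unit values below are hypothetical and chosen only so that output rises most for the item whose unit value rises least; they are not Census figures, and the function names are illustrative.

```python
def quantity_index(q0, q1, weights):
    """Weighted aggregate quantity index, base period = 100."""
    numerator = sum(w * q for w, q in zip(weights, q1))
    denominator = sum(w * q for w, q in zip(weights, q0))
    return 100.0 * numerator / denominator


def marshall_edgeworth(q0, q1, w0, w1):
    """Cross-weight index: each item weighted by the sum of its two unit values."""
    cross = [a + b for a, b in zip(w0, w1)]
    return quantity_index(q0, q1, cross)


# Hypothetical data: item 1 expands sharply with a small rise in unit value,
# item 2 expands little with a large rise in unit value.
q_base, q_later = [100, 100], [250, 120]
w_base, w_later = [1.0, 1.0], [1.2, 2.0]

base_weighted = quantity_index(q_base, q_later, w_base)
later_weighted = quantity_index(q_base, q_later, w_later)
cross_weighted = marshall_edgeworth(q_base, q_later, w_base, w_later)
print(base_weighted, cross_weighted, later_weighted)
```

With data of this pattern the base-weighted index is highest, the later-weighted index lowest, and the cross-weight index falls between them, which is the ordering reported for the 1939-weight, 1947-weight, and Marshall-Edgeworth indexes.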

Let us suppose provisionally that the explanation of these facts lies in 
economies of large scale production. Such economies would presume down- 
ward sloping long-run supply curves, and would become operative exactly in 
those industries whose capital expansion was the greatest, and those would 
likely be industries where demands increased most. The observed facts would 
be explained nicely. We now have a new understanding of the different 
meanings of the indexes based on 1939 and 1947 weights respectively. The 
1939 weight index approximates the number of years it would have taken 
the 1939 industry, with its cost structure, to produce the 1947 output. The 
1947 weight index is the reciprocal of the fraction of a year that it would 
have taken the 1947 industry to produce the 1939 output. The cross weight 
index is a measure based on the operation of some intermediately adjusted 
productive system. The problem of weighting methods has found solution in
terms of economic meaning.

It may be objected that the suggested explanation may not be the correct
one. There are indeed other possible explanations of the response of supply to
demand. Monopoly and wage rate effects are possibilities. Another is the
observable fact that two groups of industries showing less than average 
growth in volume of output, namely petroleum and coal products and lumber 
and lumber products, are closely tied to natural resources. Were such Te 
source limitations the cause of the low output expansion and high price reat- 
(tions? To what extent do measures of manufacturing production reflect sui 
limitations? A discussion of the questions rising from theory and measure 

. ment would be very illuminating. 

Another aspect of the weighting problem is its relation to national income 
measures. Value added by manufacture is the basis for desirable weighting, 
because it eliminates duplication in value of commodities. However, from the
standpoint of the economy as a whole, duplication may still arise, because
taxes, advertising, insurance, professional services and the like are repre-
sented in value added by manufacture, though they have separate represen-
tation in national income. The United Nations Statistical Office has recom-
mended the use of weights excluding such economic activities. The writers of
the work in review have handled the problem only by noting that prepara-
tion of such weights, other than those based on value added by manufac-
ture, “was not found feasible” (p. 3). Surely the question of the position of
the manufactures index in all output deserves more consideration than
It is worth noting in this connection that the differences of national income
and value added by manufacture may be just as important as changes in
value added by manufacture per unit of output in time. Total value added
by manufacture in 1947 was $74,428 millions, of which 12.2 per cent or
$9,053 millions was due to food production. National income originating in
manufacturing, according to Department of Commerce estimates, was
$59,459 millions in 1947 of which 9.8 per cent, or $5,822 millions, were al-
located to food products. The differences demonstrate that this weighting
question warrants careful analysis. Also it would be desirable to compare
the production indexes with other value and price information than Census
data, which information might have been used for independent testing of the
consistency of different source data.
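
The shares quoted above can be checked directly from the dollar figures. In the short sketch below the figures enter only as ratios, so their common unit does not matter; the helper name is illustrative.

```python
def share_percent(part, total):
    """Return part as a percentage of total."""
    return 100.0 * part / total

# Food as a share of total value added by manufacture, 1947.
food_share_value_added = share_percent(9053, 74428)
# Food as a share of national income originating in manufacturing, 1947.
food_share_national_income = share_percent(5822, 59459)
# National income originating in manufacturing relative to value added.
income_to_value_added = 59459 / 74428
```

The two food shares differ by roughly two and a half percentage points, which is the size of the weighting difference at issue.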

The definition of physical output as value of output at constant prices 
would work well if prices stayed constant, if the physical nature of goods 
remained unchanged, and if the definition is significant to economic analysis. 
All three, however, are questionable. Quality changes particularly embarrass 
the authors. They observe: “Technological change may make possible the 
substitution of less expensive materials without affecting the quality of the 
product” (p. 7), and again, “The (quality) changes are difficult if not im- 
possible to measure quantitatively, and the failure to reflect them in the 
present index, results in a downward bias” (p. 7). These two statements 
imply that a product is defined as a capacity for doing a particular job in 
production functions and consumer preference functions. If this is to be our 
definition of output, let us have it in the open and apply it the best we may. 
(It will lead to surprising results in many cases such as rubber tires, and 
nylon socks.) It also appears at conflict with a weight selection which reflects 
cost conditions in different industries. Economic theory has various defini- 
tions of output. These should not be ignored. 

Questions of economic analysis arise in connection with the otherwise ex- 
cellent studies of representation of the “missing” industries, that is, in- 
dustries for which physical production data of any sort are largely missing, 
covering about one-fourth of all industries in terms of value added by 
manufacture. The selection of a method for measuring output in these in-
dustries was based on studies of what method would have worked best in
industries where quantity data are available, if they had not been available.
A critical test was to determine which of three artificially constructed indexes
came closest to the originally computed index. The assumptions used in
constructing these artificial indexes were: (1) temporal change in value of
output per unit of physical product was similar among related industries,
(2) temporal change in value added by manufacture per unit of physical
product was similar among related industries, and (3) the temporal change
in output per man employed was similar among industries. The winner in
this interesting contest was number three, which yielded the highest index
for 1947 by some six points, or three per cent in the case of the output as-
sumption (1), and by half as much for the value added assumption (2).
Economic theory poses some questions about this selection of method. The
“missing” industries are particularly prominent in machinery industries,
fabricated metal products, and miscellaneous industries, which are industries 
whose production has grown particularly fast. This fact suggests a question 
of the following kind: Is the price behavior of these industries to be ex- 
plained by inefficiencies arising from dynamic change, by increasing effi- 
ciencies due to economies of scale, or by wage rate developments? So far as 
we can judge by price changes, the answer would appear to be that the 
second possibility is most likely, since generally price rises were less in the 
more expanding industries. This suggests, in turn, that output per man rose 
more in the “missing” industries than in others, suggesting, in turn, that the 
output per man selection may have understated the index, though by less 
than the other two alternatives considered. A discussion of these theoretical 
questions would be illuminating both to economic analysis and to measure-
ment.

The reviewer believes that theory or explanation is inseparable from 
measurement. At the moment, measurement is ahead of theory in the sense 
that we have measures such as the index of manufacturing whose significance 
is not fully understood because we do not know the underlying forces ac- 
counting for the particular observed changes. Measurement would be im- 
proved if answers were sought to specific questions of the following nature: 

(1) How long would it have taken the productive mechanism of 1939
(say) to produce the output of 1947, and conversely with interchange of
dates?

(2) To what extent was the change in output accomplished by changes in
employment, changes in technology, and economies of scale?

(3) What was the output of consumers’ goods, in the sense of selling values
at constant prices with allowance for quality changes and in the sense of
production requirements? What was the investment output in the sense
of production requirements and in the sense of capacity for new production?

(4) What are the indexes of output and employment in national income
and gross national product senses?

(5) How well do the indexes reconcile with price, value of shipment, na-
tional income and other independent data?

These questions cannot be answered easily or quickly, but beginnings can
be made at once. Perhaps a committee of learned societies is in order for the
purpose of making recommendations regarding expanded activities of gov-
ernment statistical workers.


Washington State Statistical Abstract. Marilyn Druck Robinson. Seattle, Wash- 
ington: University of Washington Press, 1952. Pp. xi, 159. $4.50. Paper. 


PAUL B. SIMPSON, University of Oregon
THIS volume brings together a large quantity of data, covering prin-
cipally area, population, employment, payrolls, production,
transportation, income, prices, and finance. The state of Washington nat-
urally receives the most attention, but comparative data for the United
States and for western states are frequently presented. 

Although this volume uses data gleaned from census and other national 
sources as its chief stock in trade, it is far from a mere compendium of such
material. The reviewer estimates that at least one-fourth of the 104 tables 
contain data not available from standard sources. Examples of information 
hard to come by are trade and income statistics based on Washington tax 
information, indexes of business activity in the Northwest, housing- and 
rental-survey information, family income-survey information, Washington 
data about utilities and motor carriers, and lumber shipments reported by 
the Pacific Lumber Inspection Bureau. Other tables give inter-census inter- 
polations and county and regional break-downs of special information not 
available in standard sources. Payrolls, births and deaths, population, labor 
force, income, and the mineral statistics are examples. 

The work is carefully documented as to sources, nature of information, and 
additional sources, and includes warnings about misusing the data. By care- 
ful study the researcher can obtain a fairly complete picture of what regional 
information is available from all sources. There can never be too much ex- 
planation, however. In using the volume this reviewer found himself wishing 
that it included a summary of the idiosyncrasies of the Washington State 
retail and industrial income tax laws and of covered employment under the 
Social Security Program. A separate list of Washington State reports and 
local source materials would also have been useful to give the research
worker a handy reference to sources of local data.

Very few similar volumes exist for other states. The Universities of Ala- 
bama and of Mississippi have compiled statistical abstracts for their states, 
and a certain amount of statistical information is included in the state blue
books. Generally, regional data are difficult to locate and to obtain. Marilyn
Robinson and the other members of the Bureau of Business Research of the 
University of Washington who prepared the abstract have made a valuable 
contribution to regional research. 
MEASURING THE ACCURACY AND STRUCTURE
OF BUSINESSMEN’S EXPECTATIONS

IN 1948 an extensive project on the relationship between business-
men’s expectations and business fluctuations was begun under the
joint sponsorship of the Merrill Foundation for the Advancement of
Financial Knowledge and the University of Illinois. This project has
been conducted under the direction of Franco Modigliani, who made
important contributions to the present study and to the other sub-
projects making up the program of research as a whole. As part of this
program, an analysis was undertaken of the accuracy and structure of
railroad shippers’ forecasts, probably the only continuous set of data
on economic expectations in existence extending back quarterly to the
1920's and relating to individual industries and regions. A number of
statistical problems arose in the course of analyzing these data, the
treatment of which would seem to be of broad interest to economic stat-
isticians and others working with time series data. It is the purpose of
this article, therefore, to present these problems and the methods used
to solve them. Some of the results of the study are also presented in the
course of this exposition.


1. THE DATA 
The firms that account for the great bulk of railway shipments are
members of the National Association of Shippers Advisory Boards.
This organization has some 25,000 members. It is affiliated with the

1 The full results are being published in bulletin form by the Bureau of Economic and Business
Research of the University of Illinois; The Railroad Shippers’ Forecasts, a monograph, a Study in Busi-
ness Expectations and Planning, by Robert Ferber.

At this point, the writer would like to acknowledge the valuable assistance provided by several
members of the Merrill Project in this study. Special thanks are due to Jack J. Feldman for his con-
tributions to the analytical framework employed and to Mary Lou Walling and Jean Rogers for the
various statistical services they performed.
Association of American Railroads, and its primary function is to con-
sult with, and advise, the AAR on shipping problems and transporta-
tion conditions. To this end, the National Association is subdivided
into thirteen regional boards, and within each board, into about 32 
major commodity groups and, where particular individual commodi- 
ties are of local importance, into commodity subgroups. The composi- 
tion of these regions and of the major commodity groups has remained 
remarkably stable since 1927. 

The forecasting procedure begins with the transmittal of requests 
to the shippers about the middle of a quarter for estimates of their 
freight car requirements in the next quarter. The forecasts are com- 
piled about two or three weeks later, usually by the secretary of the 
regional board, who is an employee of the AAR and is not a shipper; 
in one region the forecasts are compiled by the chairmen of the com- 
modity groups. Whoever compiles the data computes the expected per- 
centage increase or decrease in the next quarter’s shipments for each 
commodity group and for the region, as compared with the group’s 
shipments in the corresponding quarter of the previous year. At this 
stage, the commodity group chairman has the right to modify his 
group's forecast if he so chooses, but actually such changes are appar-
ently made only when some major development suddenly occurs which
the shippers may not have anticipated, such as a labor stoppage. 

Though not designed on any probability sampling scheme, these 
data appear to be representative on the whole. Response rate of the 
members for any one commodity group may vary anywhere between 
25 and 80 percent. However, since special emphasis is placed on se-
curing replies from the larger shippers, the coverage in terms of ship-
ments is much higher and often almost equivalent to that of a com-
plete census. The traffic managers, the officials who prepare the fore-
casts, evidently do so carefully, and no evidence was found of any at-
tempt to modify the forecasts so as to conceal information from com-
petitors—that is, commodity-group chairmen—or to pad the forecasts
so as to ensure having enough cars on hand.


2. OBJECTIVES 
In the main, the analysis of these data had the following threefold 
objective: 
1. To measure the accuracy of the forecasts.
2. To test hypotheses concerning the structure of the forecasts, i.e.,
whether the forecasts can be explained by events that happened in the
past.
3. To determine whether some transformation of the forecasts can
be used to improve the accuracy of the forecasts as such. 
Each of these points will be elaborated upon in later sections. 


3. THE MEASUREMENT OF ACCURACY


There are two respects in which the accuracy of any forecasts can be
evaluated. First we may ask: How close do the forecasts come to what 
actually happened? In other words, what is the margin of error in the 
shippers’ forecasts? Answering these questions, however, provides little 
information on the practical value of the forecasts. To determine the 
latter, we must ask: How does the error of the shippers' forecasts com- 
pare with the error that would have been obtained by using some alter- 
native, readily available forecasting method? In effect, answering this 
question constitutes a test of the relative accuracy of the forecasts. No 
matter how close the forecasts may be to actual shipments, they cannot 
be of much practical value if some other simple forecasting procedure 
proves even more accurate. And conversely, the shippers’ forecasts may 
be of practical value in the sense of coming closer to actual shipments 
than alternative forecasting procedures, and still be considerably in
error.


3.1 Accuracy of Level Forecasts


A comparison between the levels of actual and forecasted carload- 
ings is shown in Chart 1. The upper panel refers to all carloadings and 
the lower panel to products of manufactures and mines only; agricul- 
tural commodities were excluded from the study. This panel indicates 
that the shippers’ forecasts tend to be too low in the upswings and too 
high in the downswings. In other words, the forecasts seem to lag behind 
actual events. Thus, the forecasts overestimated actual shipments all
through the 1929-32 and 1937-38 contractions and even in the rela-
tively mild recessions of late 1946 and 1948. The reverse was generally
true for upswings. In fact, in a number of instances the forecast for the
current quarter, denoted by the letter t, appears to be essentially the
actual value for the corresponding quarter in the preceding year, t-4.
For example, actual carloadings hit bottom in the second quarter of
1932, but the forecasts did not reach their low point until the second
quarter of 1933.

CHART 1. Forecasted and Actual Carloadings, 1927-1949.

Although this chart portrays the general trend of the forecasts in
relation to actual shipments, it does not provide a precise indication of
the accuracy of the forecasts. Such an indication may be obtained by
studying the ratio of expected shipments to actual shipments, a ratio
denoted here by E_t/A_t, or, where no lags are involved, by E/A. Table
1 presents two statistics based on this ratio. One is a measure of over-all
accuracy, the average of the absolute relative errors of the forecasts,
i.e., Σ|(E/A) - 1|/N. The values for this statistic are presented for total
carloadings and for nonfarm carloadings, excluding 1942-45, broken
down by prewar and postwar periods and by the immediate trend of
carloadings. Coal and ore shipments have been excluded from the non-
farm total in this part of the analysis because of their disproportion-
ately high weight in the total and because of the extent to which ship-
ments of these commodities are influenced by such factors as labor dif-
ficulties and weather conditions. Three trends are distinguished: rising,
level, and falling. Since the analysis is based on seasonally unadjusted
data, each type of trend is defined with reference to the ratio A_t/A_{t-4},
i.e., actual shipments in the current quarter divided by shipments in the
corresponding quarter of the preceding year. A particular quarter is
said to exhibit a falling trend if A_t/A_{t-4} is less than .95, a level trend if
A_t/A_{t-4} is from .95 to 1.05 inclusive, or a rising trend if A_t/A_{t-4} exceeds
1.05.
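
The two definitions just given, the trend classification and the average absolute relative error, can be sketched as follows. This is a minimal illustration assuming the series are held in plain Python lists; the function names are hypothetical and no figures from the study are used.

```python
def classify_trend(a_t, a_t_minus_4):
    """Classify a quarter by the ratio of actual shipments to those of the
    corresponding quarter of the preceding year."""
    ratio = a_t / a_t_minus_4
    if ratio < 0.95:
        return "falling"
    if ratio <= 1.05:
        return "level"    # .95 to 1.05 inclusive
    return "rising"

def mean_absolute_relative_error(expected, actual):
    """Average of |E/A - 1| over all quarters supplied."""
    errors = [abs(e / a - 1.0) for e, a in zip(expected, actual)]
    return sum(errors) / len(errors)
```

For instance, forecasts of 110 and 90 against actual shipments of 100 in each of two quarters give an average absolute relative error of 10 percent, even though the two errors cancel in the aggregate.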

TABLE 1
ACCURACY OF CARLOADINGS ESTIMATES, 1927-41, 1946-50

                           Average absolute percent of error     Standard deviation
                           Rising    Level    Falling    All     of E/A about 1
Nonfarm Carloadings
  1927-41                   5.8       5.8      18.9      10.0        118
  1946-50                   6.1*      3.5*     10.5†      6.0        .12
  1927-41, 1946-50          5.5       4.6      17.5       9.0        .12
Total Carloadings
  1927-41, 1946-50          5.5       4.8      15.4       8.8        .12

* Fewer than 10 observations.
† Fewer than 5 observations.

A measure of the dispersion of the estimates is presented in the last
column of the table, which contains the values of the standard devia-
tion of the ratio, E/A, about unity. The value of this statistic increases
the further the ratios depart from 1, which in this case seems to pro-
vide a more relevant measure of dispersion than the standard deviation
about the mean.2
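
The difference between the two dispersion measures can be seen in a small numerical sketch. The ratios below are invented, and the standard deviations are computed with divisor N, which is an assumption about the convention used; the point is only that the dispersion about unity equals the dispersion about the mean augmented by the squared distance of the mean from 1.

```python
import math

def std_about(values, center):
    """Root mean squared deviation of values about a chosen center (divisor N)."""
    return math.sqrt(sum((v - center) ** 2 for v in values) / len(values))

ratios = [1.10, 0.95, 1.20, 1.02, 0.98]      # hypothetical values of E/A
mean_ratio = sum(ratios) / len(ratios)

std_about_mean = std_about(ratios, mean_ratio)
std_about_unity = std_about(ratios, 1.0)

# Variance about unity = variance about the mean + (mean - 1)^2.
reconstructed = math.sqrt(std_about_mean ** 2 + (mean_ratio - 1.0) ** 2)
```

A set of forecasts that is consistently too high or too low therefore shows a larger deviation about unity than about its own mean, which is why the former is the more relevant measure of accuracy.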

Over the entire period studied the use of this measure of accuracy 
indicates that the shippers’ forecasts, on the average, deviated about 
10 percent from actual shipments in an individual quarter in the pre-
war years with the error declining to 6 percent in the postwar years.3

When the observations are grouped according to the trend of car-
loadings, considerable differences appear between the means of the
various subgroups. In general, the estimates tend to be most accurate
when no trend in carloadings is perceptible; this is especially true of
total carloadings, for which the shippers’ forecasts came, on the
average, to within 4 percent of actual shipments.

The estimates tend to be least accurate in downswings. In upswings, 
shipments tend to be underestimated on the average by about 5 per- 
cent, which contrasts with the average overestimate of 17 percent in 
periods of contraction. This striking difference is also borne out by


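2 Actually the two are related by the expression:

\[ \sigma^2_{(E/A)-1} = \sigma^2_{E/A}\left[1 + \left(\frac{\overline{E/A} - 1}{\sigma_{E/A}}\right)^2\right] \]

where \sigma_{(E/A)-1} is the standard deviation of E/A about unity, \sigma_{E/A} is its standard deviation about the mean, and \overline{E/A} is the mean ratio.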


3 The greater accuracy of the forecasts in the postwar period is attributable, at least in part, to the 
lesser amplitude of fluctuation of shipments in this period relative to that in the prewar years, which 
would leave smaller margins for error, other things being equal.
analysis of the individual ratios. In every one of the 26 instances of de-
clining carloadings, shipments were overestimated, whereas among the
31 instances in which carloadings were rising, there were 26 cases of
underestimates. The average absolute error of the forecast is also seen
to be very much smaller in rising periods.
Do these results indicate that a real tendency exists for the shippers
to err more on declines than on rises or are the greater errors on down-
swings primarily the result of the greater intensity of declines during
the period studied?4 On examining the data, we find that facts can be
marshaled to support both hypotheses. During the period of observa-
tion, the average decline, in those quarters in which declines occurred,
was 20.3 percent; the average of the rises was 17.2 percent. In addition,
the coefficient of determination between the percentage error in the
forecasts and the percentage change in carloadings is .54 for the de-
clines and .40 for the rises. Thus, the declines were in general sharper
than the upswings, and large errors in forecasting do tend to be asso-
ciated with large changes in carloadings. On the other hand, neither
the difference between the average annual decline and the average an-
nual rise nor the extent of association between the errors in the fore-
casts and the magnitude of change is substantial—clearly not so pro-
nounced as the difference between the average errors of the forecasts
on declines and upswings. Nevertheless, some allowance for these dif-
ferences seems necessary, and a simple means of making such an allow-
ance lies in computing the regression of the ratio of expected to actual
shipments on the percentage change in carloadings for declines and up-
swings separately. The results (for nonfarm carloadings) are as fol-
lows:

Declines:
E/A = 102% + .733% decline

Rises:
E/A = 103% - .385% increase

The regressions indicate that, on the average, a 10 percent contrac-
tion in carloadings increases the overestimate in the forecasts by 7.33
percentage points, whereas a 10 percent increase in carloadings in-
creases the underestimate by 3.85 percentage points. It seems, there-


4 Another alternative, suggested by a referee, is that the results are a peculiarity of the statistic
used. For if E and A are independent of each other, the expected value of Σ(E/A)/N will exceed
unity, and much more so when E exceeds A than when A exceeds E, for the same set of alternatives.
Though this bias in the statistic may account in part for the observed phenomenon, it is hardly likely
to constitute a major cause because, even under the assumption of independence, the range of the ratio,
E/A, is relatively so narrow—generally between .8 and 1.2—that it could not give rise to such large
differences in accuracy as noted above.
fore, that the shippers do indeed tend to err more when carloadings de-
cline than when carloadings rise.5
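
The asymmetry between the two fitted lines can be made concrete by evaluating them at a few percentage changes. The coefficients below are those of the two regressions reported above; the function names are illustrative only.

```python
def fitted_ratio_on_decline(pct_decline):
    """Fitted E/A, in percent, for a quarter in which carloadings fell by pct_decline."""
    return 102.0 + 0.733 * pct_decline

def fitted_ratio_on_rise(pct_increase):
    """Fitted E/A, in percent, for a quarter in which carloadings rose by pct_increase."""
    return 103.0 - 0.385 * pct_increase

# Effect of a further 10-point change in carloadings on the fitted ratio.
added_overestimate = fitted_ratio_on_decline(20.0) - fitted_ratio_on_decline(10.0)
added_underestimate = fitted_ratio_on_rise(10.0) - fitted_ratio_on_rise(20.0)
```

Per 10 points of change, the fitted ratio moves nearly twice as far on declines as on rises, which is the sense in which the shippers err more when carloadings fall.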

An additional bit of evidence favoring the second hypothesis is pro-
vided by the existence of much the same phenomenon in the postwar
period, as is shown by Table 1. The overestimates in the prewar period
might have been explained on the ground that 1929-32, when most of
the declines occurred, was not a representative period. The general
expectation at the time was one of permanent prosperity, and in its
initial phases the decline was considered to be a purely temporary phe-
nomenon. At another time, it could be argued, such an erroneous expec-
tation is not likely to be present. However, such an explanation is
clearly not valid for the postwar years—in particular for 1949 when 
declines were widely expected. Yet the largest overestimates occurred 
at the end of that recession. 

Correlation of actual and anticipated rates of change. Comparison of
the level of expected and actual shipments, though useful in itself, is
an insufficient indicator of the accuracy of the forecasts. The forecasts
and the actual shipments are necessarily related to each other because
of the serial correlation in the data, i.e., because of the correlation of
both E_t and A_t with A_{t-1}. For this reason, E_t and A_t may both be of the
same general magnitude, with fairly high correlation, although the
direction of change may be missed altogether. A more reliable indicator
of accuracy is therefore obtained by removing this spurious element
and comparing the anticipated and actual rates of change, i.e., E_t/A_{t-1}
and A_t/A_{t-1}, rather than the actual levels.

An analysis based on E_t/A_{t-1} has the drawback of introducing the
problem of seasonal variation. Rather than attempt removal of the
seasonal component with the more or less doubtful standard tech-
niques, it was decided to utilize a much simpler technique made pos-
sible by the availability of data for a substantial number of years,
namely, to analyze separately each quarter of the year. Unless the sea-
sonal pattern changes markedly from year to year, this method seems
most suitable for the present purpose.

The coefficients of correlation between E_t/A_{t-1} and A_t/A_{t-1} obtained
for aggregate nonfarm carloadings for the prewar period are as follows:


5 The question might be raised whether the slope of the downswing regression may not be biased
upward by the presence of extreme values in those two series. However, there seems to be little likelihood
of such a bias because, even though the average decline exceeds the average rise, there are almost as
many extreme rises (3) as there are extreme declines (4). In any event, the elimination of the four ex-
treme changes from the downswing data, which almost equalizes the average decline and average rise,
leads to the regression: E/A = 103% + .667% decline. The above conclusion would therefore remain
the same.
             Correlation      5% level of significance
Quarter      coefficient      (absolute values)
1st             -.14                .53
2nd             -.24                .51
3rd             -.43                .51
4th              .44                .51

Clearly, the shippers' forecasts completely failed to anticipate the 
rate of change of shipments. In the first three quarters the correlation 
is negative; when shippers are optimistic and anticipate a (more than
seasonal) rise, actual shipments are more likely to fall than to rise. Only 
in the fourth quarter is the correlation positive, but it is very small. 
In fact none of these coefficients is significantly different from zero at 
the 5 percent level of significance. On the whole, therefore, the correla- 
tion between actual and anticipated shipments seems to be approxi- 
mately zero. If it differs from zero, it is more likely to be negative than 
positive.6
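
The quarter-by-quarter procedure described above can be sketched in a few lines. The code assumes that anticipated and actual rates of change are both taken relative to the preceding quarter's actual shipments and that the series begin with a first quarter; the function names are hypothetical, and the significance levels in the table are not computed here.

```python
import math

def pearson(x, y):
    """Ordinary product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlations_by_quarter(expected, actual):
    """Correlate E_t/A_{t-1} with A_t/A_{t-1} separately for each calendar quarter.

    Index 0 of both lists is assumed to be the first quarter of the first year.
    """
    anticipated = {1: [], 2: [], 3: [], 4: []}
    realized = {1: [], 2: [], 3: [], 4: []}
    for t in range(1, len(actual)):
        quarter = t % 4 + 1
        anticipated[quarter].append(expected[t] / actual[t - 1])
        realized[quarter].append(actual[t] / actual[t - 1])
    return {q: pearson(anticipated[q], realized[q]) for q in (1, 2, 3, 4)}
```

Treating each quarter separately means that a stable seasonal pattern shifts every ratio for that quarter by a common factor, which leaves the correlation coefficient unaffected.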

This might seem a rather astonishing result. Actually, however, all 
that it shows is that the shippers’ forecasts are not unlike other fore- 
casts: they provide a good idea of the general level of business condi- 
tions—which is not surprising, considering the short period ahead for 
which the forecasts are made—but scant evidence as to the direction 
of change. The latter is the crucial problem in forecasting business con- 
ditions, and the shippers’ forecasts do not seem to supply the answer, 
at least so far as the aggregate of all reporting industries is concerned. 

It might be noted that much the same results were obtained when 
the same techniques were applied to commodity groups and to selected 
commodity groups within regions—reasonably good forecasts of level,
much greater errors on downswings than on upswings, and near-zero
correlation between actual and anticipated rates of change. 


3.2 Are the Forecasts Better than Simple Projections?


Do the shippers’ forecasts provide more accurate estimates of car-
loadings than might be secured through some simple projections rely-
ing on the serial correlation in the actual series itself? If this is not the
case, it would then seem that the information collected from the rail- 
road shippers is indeed of very little forecasting value. 


6 This conclusion was supported when E_t/A_{t-1} was correlated with A_t/A_{t-1} for all quarters com-
bined, after the ratios were adjusted for seasonal variation. The seasonal adjustment consisted of divid-
ing all the ratios for a given quarter by the average value of A_t/A_{t-1} for that quarter, a procedure which
in effect provides estimates of the ratios of the seasonally adjusted data. The resultant correlation was
-.01.
Two alternative simple forecasting models were used in the test, both
based on the serial correlation in the actual shipments. The first con-
sisted in predicting carloadings in the current quarter at the same level
as carloadings in the corresponding quarter of the preceding year. In
other words, the forecast of carloadings by this method, say E_t*, is
simply A_{t-4}. Were it not for the seasonal element, A_{t-1} would be prefer-
able, but in the absence of seasonal adjustment A_{t-4} is the most plausi-
ble choice.

This measure, however, has the disadvantage of making no allow-
ance for short-run trends in business conditions, a disadvantage that is
intensified by the use of A_{t-4} instead of A_{t-1}. Some such allowance is
clearly indicated, which leads to the second forecasting measure based
on serial correlation. This measure, which we may call E_t**, predicts
carloadings in the current quarter as the level in the corresponding
quarter of the preceding year adjusted for the change in carloadings
over the last year; in other words, A_{t-4}(A_{t-1}/A_{t-5}).
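
The two projections can be written out as a short sketch. It assumes the actual shipments are held in a list indexed by quarter, and it takes the adjustment for the change over the last year to be the ratio of the most recent quarter to the same quarter one year earlier; the function names are illustrative.

```python
def projection_same_quarter(actual, t):
    """First projection: shipments of the corresponding quarter of the preceding year."""
    if t < 4:
        raise ValueError("need at least four earlier quarters")
    return actual[t - 4]

def projection_with_trend(actual, t):
    """Second projection: the same-quarter level scaled by the latest year-over-year change."""
    if t < 5:
        raise ValueError("need at least five earlier quarters")
    return actual[t - 4] * (actual[t - 1] / actual[t - 5])

# Hypothetical quarterly carloadings, index 0 being the earliest quarter.
shipments = [100.0, 110.0, 120.0, 130.0, 105.0]
first = projection_same_quarter(shipments, 5)     # same quarter a year ago
second = projection_with_trend(shipments, 5)      # scaled by 105/100
```

Here the latest quarter ran 5 percent above its level a year earlier, so the second projection raises the first by the same 5 percent.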

Using these two measures, forecasts of carloadings were made for
each quarter from 1927 to 1941, first for total nonfarm commodities
(excluding coal and ore) and then separately for each of five selected
commodity groups. The forecasts were segregated by the trend of car-
loadings, and the accuracy of the forecasts obtained by each of these
two methods was compared with the accuracy of the shippers’ forecasts.
The comparison is presented graphically in Chart 2, which shows the
proportion of time (quarters) that E_t is more accurate than E_t* and
E_t**, in turn, by commodity group and trend of carloadings. The ver-
tical dashed line overlapping each set of three bars indicates the relative
accuracy of E_t for each commodity group averaged over all three phases
of the cycle. The shippers’ estimates are more or less accurate, on the
average, than the simple projections to the extent that the bars and
dashed lines extend to the right or left of the 50 percent guide line.
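
The comparison underlying the chart amounts to counting quarters. The sketch below assumes that accuracy is judged by the absolute error of each forecast and that ties are credited to the shippers; both conventions, like the function name and the sample figures, are assumptions made for illustration.

```python
def percent_at_least_as_accurate(shipper_forecasts, projections, actual):
    """Percentage of quarters in which the shippers' forecast is at least as
    close to actual shipments as the rival projection."""
    wins = sum(
        1
        for e, p, a in zip(shipper_forecasts, projections, actual)
        if abs(e - a) <= abs(p - a)
    )
    return 100.0 * wins / len(actual)

# Hypothetical figures for four quarters.
shippers = [100.0, 95.0, 120.0, 80.0]
rival = [90.0, 100.0, 110.0, 100.0]
realized = [100.0, 100.0, 100.0, 100.0]
share = percent_at_least_as_accurate(shippers, rival, realized)
```

A result above 50 percent would place the corresponding bar to the right of the guide line, and a result below 50 percent to the left.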

A number of points are evident from this chart. One is that marked 
differences in the relative accuracy of the shippers’ estimates exist in 
the various phases of the cycle. The type of projection used is, of course, 

[...] tested [...] T. W. Schultz and O. H. Brownlee (e.g., Schultz and
Brownlee, “Two Trials to Determine Expectation Models Applicable to
Agriculture,” Quarterly Journal of Economics, Vol. 56 (1942), pp. 487-496;
Brownlee, O. H. and Gainer, W., “Farmers’ Price Anticipations and the Role
of Uncertainty in Farm Planning,” Journal of Farm Economics, Vol. 31
(1949), pp. 266-275). However, Merle Crawford has brought to our attention
references on the use of such methods in the literature of the late 1920’s.
Exact counterparts of the models employed in this study were used to judge
the accuracy of monthly forecasts of [...] production in Chicago for
1929-30 (King, R. B., “A Method of Appraising Short-term Forecasts,”
Journal of the American Statistical Association, Vol. 26 (1930), pp.
333-334). Similarly, the simple extrapolation of levels was suggested as an
alternative, and possibly superior, method of forecasting in 1927 (Comer,
H. D., and Watkins, R. J., “Forecasting a Line by Itself,” Journal of the
American Statistical Association, Vol. 22 (1927), pp. 505-507).
CHART 2. ACCURACY OF SHIPPERS’ EXPECTATIONS RELATIVE TO SIMPLE PROJECTIONS,


1927-1941
[Bar chart by commodity group. Panels: proportion of times accuracy of $E_t$
equals or exceeds that of $E_t^{*}$; proportion of times accuracy of $E_t$
equals or exceeds that of $E_t^{**}$. Horizontal scale: 0 to 100 percent.]


also highly relevant, and interacts with the phase of cycle in determining
the relative accuracy of $E_t$. When $E_t^{*}$ is the yardstick, the shippers’
estimates are superior except in level periods. This is only to be
expected, for when carloadings remain level, $E_t^{*} = A_{t-4}$ is bound to be
highly accurate by definition.
As compared with $E_t^{**}$, the shippers’ estimates tend to be superior
much more frequently when carloadings are rising or level than when
they are declining, although the overwhelming superiority of $E_t$ in the
level phases of iron and steel and of agricultural implements carloadings
is somewhat misleading because there are fewer than five observations
in each case. The shippers’ estimates measure up very poorly against
$E_t^{**}$ when carloadings are falling. Inspection of the individual
observations indicates that the 1929-32 depression was the period when this
discrepancy was greatest. When shipments were declining during this
period, $E_t^{**}$ came very close to the actual figures, whereas overestimates
by the shippers of as much as 50 percent were not uncommon.
Another measure of the forecasting value of the shippers’ estimates
was obtained by comparing the sizes of the errors committed by the
shippers and by the mechanical formulas instead of just counting the
number of times one forecast proved superior to the other. If the
shippers’ forecasts are only slightly less accurate than the mechanical
formula very frequently but happen to be appreciably better at certain
crucial times, such as cyclical turning points, these two means of
evaluation may produce very different results. To test this possibility, the
average absolute deviation of the shippers’ forecasts from actual
shipments, i.e., $\sum |(A_t - E_t)/A_t| / N$, was compared with the average
absolute deviation of $E_t^{**}$ from actual shipments, i.e.,
$\sum |(A_t - E_t^{**})/A_t| / N$, for each of the five industries and for
total nonfarm carloadings.

The results are presented in Table 2, covering the prewar and post- 
war periods separately. The figures in this table essentially support the 
previous findings with regard to the prewar period. The mechanical 
formula proves more accurate than the shippers' forecasts for every 
industry, as shown in the last column of the table. If we call the
average absolute deviation of the shippers’ forecast $A$, and that of
$E_t^{**}$, $B$, then this column is $(B - A)/B$. When this value is negative, $E_t^{**}$ is


TABLE 2 


AVERAGE ABSOLUTE DEVIATION FROM ACTUAL SHIPMENTS OF
SHIPPERS’ FORECASTS, $E_t$, AND OF $E_t^{**}$, BASED ON MECHANICAL
EXTRAPOLATION, SELECTED INDUSTRIES


                           Mechanical formula,    Shippers’ forecasts,    Relative
                           average of             average of              accuracy of
Industry                   |A_t - E_t**|/A_t      |A_t - E_t|/A_t         shippers’ forecasts

1928-1941
Iron and steel                  18.8%                  20.5%                 - 9.0%
Lumber                           9.8                   15.2                  -55.1
Flour                          [illeg.]                 6.5                  - 4.8
Cement                           9.8                   12.4                  -26.5
Agricultural implements         20.3                   21.0                  - 8.5
Total nonfarm                    8.8                   10.0                  -20.5

1946-1950
Iron and steel                  18.2                   13.4                  +26.4
Lumber                          10.3                   10.7                  - 3.4
Flour                            7.8                    6.2                  +20.5
Cement                          10.4                    9.7                  + 7.7
Agricultural implements         11.8                   13.1                  -15.9
Total nonfarm*                   5.4                    6.4                  -18.5


* Second quarter, 1947-50. Total nonfarm carloadings in 1945 included shipments of war [...]
which could not be separated out of the total.
more accurate on the average than the shippers’ forecasts, and when
it is positive, the reverse is true. These figures show that $E_t^{**}$ is
particularly superior for iron and steel, lumber, and cement; these are also the
industries for which $E_t^{**}$ was more frequently accurate than the
shippers’ forecasts in Chart 2.
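
The error measure and the relative-accuracy ratio described above can be sketched as follows. The function names and the short series are hypothetical; the only figures taken from the text are the two 1946-1950 iron and steel deviations from Table 2 (18.2 for the mechanical formula, 13.4 for the shippers), which reproduce the tabulated +26.4 percent.

```python
def mean_abs_relative_deviation(actual, forecast):
    """Average of |A_t - F_t| / A_t over the N quarters supplied."""
    terms = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return sum(terms) / len(terms)


def relative_accuracy(dev_shippers, dev_formula):
    """(B - A) / B: positive when the shippers' average error A is the smaller."""
    return (dev_formula - dev_shippers) / dev_formula


# Hypothetical actual shipments and forecasts, for illustration only.
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
deviation = mean_abs_relative_deviation(actual, forecast)

# Table 2, iron and steel, 1946-1950: A = 13.4, B = 18.2.
iron_steel_postwar = relative_accuracy(13.4, 18.2)
```

Because each error is divided by the actual level before averaging, industries of very different size can be compared on the same scale.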

In the postwar period, however, the shippers’ forecasts appear in a 
more favorable light. They are more accurate than the mechanical for- 
mula for three of the five industries tested, lumber and agricultural im- 
plements being the two exceptions. For 1947-50, both the agricultural 
implements and the lumber shippers’ forecasts are also more accurate 
than those of the mechanical formula. The margin of accuracy in favor 
of the shippers’ estimates is as high as 26 percent for iron and steel and 
20 percent for flour. 

The results may indicate some degree of permanent improvement in
the relative accuracy of the shippers’ forecasts or they may be due to
the special circumstances prevailing in the postwar years. That the
second factor accounts for at least part of the improved accuracy of
the shippers’ forecasts is probable because (a) greater errors were found
in the shippers’ forecasts on downswings than on upswings and there
was a positive relation between the amount of error and the magnitude
of change in shipments, and (b) strikes and other special factors, which
the shippers could foresee and which place a mechanical formula at a
distinct disadvantage, were frequent in the postwar years. In more
than one quarter there is clear evidence that the shippers anticipated
a strike and modified their estimates accordingly. Because of the nature
of the mechanical formula, the effect of such action is to favor the
shippers’ forecasts not only in the quarter in which the strike occurs but in
three others as well.

All in all, therefore, these findings seem to indicate that the
forecasting value of the shippers’ estimates relative to possible mechanical
devices may be substantial only in periods in which special factors, such
as labor difficulties and limitation of output by capacity, prevail. This
does not mean, however, that these data might not be useful for
improving the forecasts at other times when taken in combination with
other factors.

4. STRUCTURE OF THE FORECASTS 


The question to which we address ourselves in this section is: What
are the major factors that explain the expected level of shipments, i.e.,
what is the structure of the forecasts? Accuracy of the forecast is of no
concern to us here, for irrespective of their accuracy we want to
determine how well we can “forecast the forecasts”.
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4.1 The Basic Hypothesis: Extrapolation of Recent Experience 


The hypothesis encountered most frequently in discussions of the 
formation of expectations and in statistical models of our economic 
system is that expectations represent extrapolations of recent
experience. The assumed extrapolation might be one of level or an
extrapolation of recent rate of change.

The “extrapolation of level” hypothesis can be stated symbolically
as follows:


(1.1)    $E_t = A_{t-1} + u_t$


where $u_t$ denotes an “error” term which might be random or might
itself represent the influence of other variables, but which, by hypothesis,
should be small.

The “extrapolation of trend” hypothesis might be stated in the form


(1.2)    $E_t = A_{t-1} + a(A_{t-1} - A_{t-2}) + u_t$.


If $a$ is positive (but smaller than unity) as seems to be usually assumed,
an extrapolation of recent trend is indicated. However, $a$ could also be
negative. Though little attention seems to have been given to this
possibility, it would be better to refer to this case as a reversal of trend
rather than as an extrapolation of trend; for a negative $a$ would imply
that activity is expected to contract below the present level whenever
an expansion has occurred in the previous period, and vice versa.
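
A small numerical sketch may make the role of the sign of $a$ concrete. The function names and the figures are hypothetical, and the error term $u_t$ is left out, so the code shows only the systematic part of hypotheses (1.1) and (1.2).

```python
def extrapolate_level(a_prev):
    """Systematic part of (1.1): the latest level is simply carried forward."""
    return a_prev


def extrapolate_trend(a_prev, a_prev2, a):
    """Systematic part of (1.2): latest level plus a fraction a of the latest change."""
    return a_prev + a * (a_prev - a_prev2)


# Hypothetical case: shipments rose from 100 to 120 in the latest period.
level_only = extrapolate_level(120.0)
trend_continues = extrapolate_trend(120.0, 100.0, 0.5)   # positive a: expect a further rise
trend_reverses = extrapolate_trend(120.0, 100.0, -0.5)   # negative a: expect a fall back
```

With a positive coefficient the expectation lies above the latest level after a rise; with a negative coefficient it lies below it, which is the reversal-of-trend case discussed above.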


4.2 The Problem of Seasonal Variation


Any attempt to test hypotheses of the form (1.1) or (1.2) when
quarterly data are involved raises the problem of seasonal variation. This
is particularly true in the present study because sizable seasonal
fluctuations exist in shipments for individual commodities as well as for the
over-all aggregates.

In the presence of such pronounced seasonal variations, hypotheses
such as (1.1) and (1.2) cannot reasonably be tested directly from the
raw data unless we are willing to make the absurd assumption that
shippers are completely unaware of seasonal variation in their
shipments, a hypothesis convincingly disproved by the results. If the
shippers are aware of seasonal variation, then it must be assumed that some
adjustment is made for the effect of recurring seasonal changes in
projecting recent experience. Obviously, then, some kind of adjustment of
the data for seasonal variation is required before any reasonable type
of extrapolation hypothesis can be tested.

One approach to this problem would be to estimate coefficients of
seasonal variation from the data, use these coefficients to remove the
seasonal component, and then proceed with the tests. This procedure is
nevertheless not entirely satisfactory for the purpose. For one thing, it
is well known that standard methods of eliminating seasonal variation
from data are unsatisfactory in many respects, particularly in the
arbitrary means involved in obtaining the seasonal coefficients. A second
objection derives from the basic purpose of this analysis, which is to
“explain” how shippers form their anticipations. It is clear therefore
that the question to be asked is not: “What is the best method of
adjusting data for seasonal variation?” but rather “What do we know
about the way in which shippers actually make adjustment for seasonal
variation?”

Information on the estimating methods used by the individual
shippers was obtained partly through a mail survey of commodity
chairmen of the regional boards and partly through interviews with
shippers and officers of some of those boards. Although the scope of the
survey was very small, complete unanimity of opinion was expressed
regarding a wide range of commodity groups and regions. In almost all
instances where a commodity was subject to seasonal variations, the
shipper indicated that he obtained his forecast by adjusting his actual
shipments in the corresponding quarter of the previous year for changes
taking place during the intervening year and for any unusual conditions
that prevailed in that quarter or that were likely to prevail in the
quarter for which the estimate was made. In other words, the shippers
themselves rely upon an implicit method of seasonal adjustment based on
their use of the corresponding quarter of the previous year as the
starting point for their forecasts.

Another factor favoring the use of implicit seasonal adjustment is the
manner in which the AAR requests the forecasts, namely, as a
percentage of the shippers’ carloadings in the corresponding quarter of the
previous year. Because of this fact, the individual shipper is inclined
to prepare his forecast with reference to that earlier quarter. In no
instance was there any evidence of a shipper’s using his level of
carloadings in the immediately preceding quarter rather than $A_{t-4}$ as the
base for his forecast.


4.3 The Extrapolation Hypothesis 


If the forecaster starts from actual shipments in the corresponding
quarter of the year before, i.e., $A_{t-4}$, and extrapolates the latest level
of activity, which is represented by $A_{t-1}$, he will have to adjust for the
growth or decline which has already occurred between $A_{t-4}$ and $A_{t-1}$.
The intervening change, however, cannot be simply measured by
$A_{t-1} - A_{t-4}$ (or some simple variant thereof), because this difference is
itself affected by seasonal variation. Lacking seasonally adjusted data, a
simple approximation to intervening change, not affected by seasonal
variation, will be represented by the change occurring during the entire
past year which is, in absolute terms, $A_{t-1} - A_{t-5}$; or, in proportionate
terms, $(A_{t-1} - A_{t-5})/A_{t-5}$.

If this adjustment for the intervening change is applied to $A_{t-4}$, we
obtain a formula of the type


(1.3)    $E_t = A_{t-4}\left(1 + \frac{A_{t-1} - A_{t-5}}{A_{t-5}}\right) = A_{t-4}\,\frac{A_{t-1}}{A_{t-5}}$.


This is precisely the “mechanical formula” used to compute $E_t^{**}$ in
the accuracy test in Section 3.2.⁸

If it is true (1) that the shippers rely on an indirect method of
seasonal adjustment which consists in adjusting $A_{t-4}$ for the change
occurring during the past year, and (2) that expectations represent
primarily an extrapolation of level, then we should expect to find that the
expression on the right-hand side of (1.3) largely accounts for the
observed fluctuations of $E_t$.

However, it is clearly desirable to set up a statistical method to test
separately the tenability of the two hypotheses just stated. To achieve
this result we might begin by testing a more general form of hypothesis
(1.3), namely


(1.4)    $E_t = a + bA_{t-4} + cA_{t-4}\,\frac{A_{t-1} - A_{t-5}}{A_{t-5}} + u_t$.


Then, (1.4) coincides with (1.3) when

(1.5)    $a = 0$;  $b = c = 1$.

If both hypotheses were correct, the regression coefficients obtained
by fitting (1.4) to the data should approximately satisfy the three
conditions (1.5). If, on the other hand, the first two conditions of (1.5) were
satisfied but not the last, it would indicate that the shippers do rely on
an adjustment of $A_{t-4}$ in preparing their forecasts and that
expectations do not represent an extrapolation of level.

It might be argued, however, that (1.3) represents in reality
something more than a mere extrapolation of level. For the adjustment of
the level $A_{t-1}$ for seasonal variation by means of the ratio $A_{t-4}/A_{t-5}$ also
tends to raise this level to the extent that $A_{t-4}/A_{t-5}$ incorporates an
element of trend. Assuming a rising linear trend, the element of trend in
this ratio would be one-fourth of the total change due to trend in any
single year, or one-fourth of the change due to trend from $A_{t-5}$ to $A_{t-1}$.
Therefore, to eliminate this trend element in projecting the level $A_{t-1}$ to
$E_t$, which is the same as adjusting $A_{t-4}$ to the level of $A_{t-1}$ in (1.4), the
coefficient $c$ in (1.4) could be as low as .75 under this assumption and
still not contradict the extrapolation of level hypothesis.

⁸ [...] from quarter $t-1$ to quarter $t$.

The result obtained from fitting (1.4) to shipments of all
commodities other than farm products for 1927-41 is:


(1.6)    $E_t = .09 + .986A_{t-4} + .43A_{t-4}\,\frac{A_{t-1} - A_{t-5}}{A_{t-5}}$

         $R^2 = .972$.


It is clear that the hypothesis fits the data very well. At the same
time, the coefficient $b$ of $A_{t-4}$ is close to unity and the constant term $a$
is close to zero. ($E_t$ fluctuates between 1.4 and 3.8, with an average of
2.3.) However, the coefficient $c$, instead of being close to .75, turns out
to be only .43.

These results suggest the following conclusions:

1. Hypothesis (1.4) appears to describe remarkably well the
formation of shippers’ anticipations, at least in the aggregate.

2. Anticipations, far from representing extrapolations of recent
trend, appear to represent a sharp reversal of trend.

Since the second conclusion is rather startling, let us examine in
greater detail the basis for this conclusion, as well as its implications.

The implication of the statistical results represented by (1.6) may
be grasped more easily by changing $b$ from .986 to 1.0, $a$ from .09 to 0,
and $c$ to .44.⁹ With these modifications, (1.6) may be rewritten as
follows:


(1.7)    $E_t = A_{t-4}\,\frac{A_{t-1}}{A_{t-5}} - .56\,(A_{t-1} - A_{t-5})\,\frac{A_{t-4}}{A_{t-5}}$.


As noted earlier, the first term of this equation can be considered to
represent primarily an extrapolation of the latest level $A_{t-1}$ crudely
adjusted for seasonal variation by means of the ratio $A_{t-4}/A_{t-5}$. But the
second term shows that the respondents tend to modify this
extrapolation of level by subtracting from it more than half the change that has
occurred during the past year, again crudely adjusted for seasonal
variation by $A_{t-4}/A_{t-5}$. Thus, if $A_{t-1}$ exceeds $A_{t-5}$, the projection of level
represented by the first term is adjusted downward, that is to say,
against the recent trend. The opposite is true when shipments have
been falling. Note, however, that the relative position of $E_t$ and $A_{t-1}$
depends also on the value of $A_{t-4}/A_{t-5}$ and in particular on the ratio
of the seasonals for these two quarters. The inversion of trend will tend
to be present only in terms of seasonally adjusted data. In terms of the
raw data an outright inversion of trend will manifest itself only when
the change from $A_{t-5}$ to $A_{t-1}$ is large relative to $A_{t-4}/A_{t-5}$.

⁹ The reason for raising $c$ to .44 is that the most significant feature of (1.6) for the present purpose
is the difference between the coefficients $b$ and $c$. This difference in (1.6) is .556; in order to keep this
difference constant when $b$ is raised to unity, we also raise $c$ to .44.

Chart 3 illustrates this possibility. Panel 1 portrays a hypothetical
course of shipments between quarters $t-5$ and $t-1$. Shipments are
assumed to be at a level of 100 in both quarters $t-5$ and $t-4$ and to rise 20
percent by quarter $t-1$. No specific assumptions are made about the
level of shipments in quarters $t-2$ and $t-3$ since it is assumed, in (1.6),
that the course of shipments in these quarters has no bearing on the
forecast made at point $t-1$. Between $t-4$ and $t-1$ the shipments could, in
principle, take any course whatever. The dot corresponding to quarter
$t$ in this panel represents the forecast for that quarter made in quarter
$t-1$ according to (1.7). This forecast represents an increase of 44
percent of 20, or 8.8, above quarter $t-4$ and, therefore, a decline of 56
percent of 20, or 11.2, below quarter $t-1$.
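
The arithmetic of Panel 1 can be reproduced with a short sketch of the rounded relation (1.7), in which the level projection $A_{t-4}A_{t-1}/A_{t-5}$ is reduced by .56 of the past year's change scaled by $A_{t-4}/A_{t-5}$. The function and argument names are hypothetical; the inputs are the figures quoted in the text (100, 100 and 120).

```python
def regressive_forecast(a_t1, a_t4, a_t5, reversal=0.56):
    """Forecast for quarter t from (1.7), given A_{t-1}, A_{t-4} and A_{t-5}."""
    level_projection = a_t4 * a_t1 / a_t5               # first term of (1.7)
    correction = reversal * (a_t1 - a_t5) * a_t4 / a_t5  # share of the rise taken back
    return level_projection - correction


# Panel 1: shipments of 100 in quarters t-5 and t-4, rising to 120 by t-1.
forecast = regressive_forecast(a_t1=120.0, a_t4=100.0, a_t5=100.0)

rise_over_t4 = forecast - 100.0   # about 8.8, i.e. 44 percent of 20
fall_below_t1 = 120.0 - forecast  # about 11.2, i.e. 56 percent of 20
```

Setting `reversal=0.0` returns the pure mechanical formula, so the parameter isolates the regressive element of the forecast.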

Panels 2, 3 and 4 of Chart 3 illustrate the regressive character of the 
forecasts by showing the actual behavior of shipments and the actual 
forecasts made by shippers in three selected quarters, namely, the sec- 
ond quarter of 1933 and the third and fourth quarters of 1936; also, 
the anticipated shipments as computed from (1.7), shown by an arrow. 

Panel 2 is particularly interesting since it coincides with the lower
turning point of the 1929-37 cycle; the shippers’ anticipations appear
to have caught this turning point, although they underestimated the
size of the increase.¹⁰ The fact that the shippers’ anticipations were so
close to the value obtained from (1.6) raises the suspicion that this
success in forecasting the turning point was hardly more than chance.
The regressive character of the shippers’ anticipations led them to
expect an expansion throughout the period of contraction from 1929 to
1932. After three years of failure, this anticipation was finally justified;
but note (in Chart 1) that the shippers anticipated an impending
contraction in every one of the following three quarters (third and fourth
quarters of 1933 and first quarter of 1934), a forecast which led to
conspicuous underestimates in two of these quarters.

¹⁰ Part of the anticipated rise could be accounted for by seasonal influences, since the second
quarter is seasonally higher than the first. However, the change anticipated by shippers was undoubtedly
more than seasonal.

CHART 3. Illustrations of Regression Phenomenon. [Panels 1 to 4.]


4.4 Reliability of the Extrapolation Hypothesis

The very high determination coefficient obtained would normally
lead to confidence in the reliability of the results. In this case, however,
the size of the determination coefficient alone is not a very reliable test
because of the serial correlation in the data. As noted earlier, $E_t$ and
$A_t$ are necessarily highly correlated ($r^2 = .85$) as a result of the
correlation of each of these variables with $A_{t-1}$. For these reasons alone, (1.4)
is bound to produce high correlation even if the true factors affecting
expectations differed from those described by our hypothesis.
Therefore, alternative means must be used to test the reliability of the
results. Two such means were used in this study, as described below.


4.4.1 Fitting Equation (1.4) to $A_t$

The first test involves recomputing (1.4) with $A_t$ as the dependent
variable instead of $E_t$. This test aids in evaluating the reliability of our
hypothesis in two ways:

1. If the correlation of $E_t$ with the two independent variables was
accounted for exclusively or primarily by its correlation with $A_t$, then
we should expect the correlation of $E_t$ with these variables to be lower,
or at any rate not higher, than the correlation of $A_t$ with these
variables.

2. If the regression coefficients of (1.4) fitted to $A_t$ turn out to be
substantially similar to those of (1.6), there would be reason to doubt
the reliability of at least some of the earlier conclusions. However, since
the correlation of $A_t$ with these variables should be due to serial
correlation, the coefficients of the two variables should be relatively close
to each other.

The outcome of the test is


(1.8)    $A_t = .25 + .886A_{t-4} + .823A_{t-4}\,\frac{A_{t-1} - A_{t-5}}{A_{t-5}}$

         $R^2 = .90$.


As expected, the multiple correlation for (1.8) is high; yet it is much
smaller than that for (1.6). Tests of significance reveal the difference
to be highly significant.

Even more impressive is the fact that the two regression coefficients
of (1.8) are close to each other, especially when compared with the
sizable difference in the coefficients of (1.6). This test therefore
confirms the absence of any significant tendency on the part of actual
shipments to regress toward the past; regression is a property of
anticipations and has no counterpart in the actual course of events.
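
The comparison just described amounts to fitting the same two regressors, $A_{t-4}$ and $A_{t-4}(A_{t-1} - A_{t-5})/A_{t-5}$, once with $E_t$ and once with $A_t$ as the dependent variable. The sketch below shows one way such a fit could be carried out by ordinary least squares in plain Python. All names are hypothetical and the series is synthetic, built so that the forecasts follow the coefficients .09, .986 and .43 exactly; it does not reproduce the study's data or its significance tests.

```python
def design_rows(dependent, actual):
    """Rows (y, 1, x1, x2) for every quarter t that has five lags available."""
    rows = []
    for t in range(5, len(actual)):
        x1 = actual[t - 4]
        x2 = actual[t - 4] * (actual[t - 1] - actual[t - 5]) / actual[t - 5]
        rows.append((dependent[t], 1.0, x1, x2))
    return rows


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][k] * solution[k] for k in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def fit_hypothesis(dependent, actual):
    """Least-squares estimates (a, b, c) and R^2 for a relation of form (1.4)."""
    rows = design_rows(dependent, actual)
    y = [row[0] for row in rows]
    x = [row[1:] for row in rows]
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(3)] for i in range(3)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(3)]
    a, b, c = solve_linear(xtx, xty)
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (a + b * xi[1] + c * xi[2])) ** 2 for xi, yi in zip(x, y))
    return (a, b, c), 1.0 - ss_resid / ss_total


# Synthetic shipments: trend, a fixed seasonal pattern and a small irregular term.
SEASON = [-0.2, 0.3, 0.0, -0.1]
actual = [2.0 + 0.04 * t + SEASON[t % 4] + 0.05 * ((7 * t) % 5 - 2) for t in range(28)]

# Synthetic forecasts generated exactly by a = .09, b = .986, c = .43.
expected = [0.0] * 5 + [
    0.09 + 0.986 * actual[t - 4]
    + 0.43 * actual[t - 4] * (actual[t - 1] - actual[t - 5]) / actual[t - 5]
    for t in range(5, len(actual))
]

coefficients, r_squared = fit_hypothesis(expected, actual)   # E_t as dependent variable
coefficients_actual, r_squared_actual = fit_hypothesis(actual, actual)  # A_t as dependent
```

Calling `fit_hypothesis` with the actual series as the dependent variable gives the counterpart of (1.8), so the gap between the two fitted coefficients can be compared across the two regressions.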


4.4.2 Explanation of the Rate of Change

As in the evaluation of the accuracy of the forecasts, we can seek to
reduce the disturbing influence of serial correlation by working with
the anticipated rate of change, $E_t/A_{t-1}$, instead of with $E_t$ itself. In
other words, we may ask: What are the factors determining the
direction and amount of change anticipated by the shippers as measured by
$E_t/A_{t-1}$? The fact that $E_t/A_{t-1}$ is apparently uncorrelated with
$A_t/A_{t-1}$ (Table 3, line 4) represents an additional advantage in favor of
working with this variable, for any significant relationship established
between $E_t/A_{t-1}$ and other variables cannot then be attributed to the
correlation of these variables with $A_t/A_{t-1}$.
This approach, however, introduces once more the problem of
seasonal variation, and the problem is dealt with, as before, by analyzing
each quarter separately. If the seasonal pattern was substantially
unchanged over the years studied, it can be shown that this method is a
very satisfactory one for estimating parameters when the variables are
subject to seasonal variation. In the present case, this condition cannot
be asserted to be fully met, although indications are that no substantial
changes in the seasonal pattern occurred.¹¹

The variable $E_t/A_{t-1}$ can be introduced into (1.4) by dividing both
sides of the equation by $A_{t-1}$. Doing so, and taking into account the
fact that the constant term in this equation was found to be close to
zero, we have:


(1.9)    $\frac{E_t}{A_{t-1}} = (b - c)\,\frac{A_{t-4}}{A_{t-1}} + c\,\frac{A_{t-4}}{A_{t-5}} + u_t'$.
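
The algebra behind this step is short and can be sketched as follows. The derivation assumes (1.4) in the form $E_t = a + bA_{t-4} + cA_{t-4}(A_{t-1} - A_{t-5})/A_{t-5}$ with $a$ set to zero, and leaves the error term aside.

```latex
\begin{align*}
\frac{E_t}{A_{t-1}}
  &= b\,\frac{A_{t-4}}{A_{t-1}}
     + c\,\frac{A_{t-4}}{A_{t-1}}\cdot\frac{A_{t-1}-A_{t-5}}{A_{t-5}} \\
  &= b\,\frac{A_{t-4}}{A_{t-1}}
     + c\,\frac{A_{t-4}}{A_{t-5}} - c\,\frac{A_{t-4}}{A_{t-1}} \\
  &= (b-c)\,\frac{A_{t-4}}{A_{t-1}} + c\,\frac{A_{t-4}}{A_{t-5}} .
\end{align*}
```

A positive difference $b - c$ therefore shows up as a positive relation between the anticipated rate of change and $A_{t-4}/A_{t-1}$, the reciprocal of the change over the past three quarters.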


It should be noted that once the seasonal variation has been
eliminated by limiting the analysis to one quarter at a time, $A_{t-4}/A_{t-5}$ may
be expected to show only minor fluctuations since it represents essentially
the change between the same two consecutive quarters in each year.
It is therefore unlikely to contribute much to the explanation of
$E_t/A_{t-1}$ because the seasonal for any given quarter is constant if the
seasonal pattern does not change over time.

The main results of this test are summarized in Table 3. Columns
2 to 5 of this table show the results obtained for each quarter separately
and column 6 contains the same data for all 55 quarters combined,
after adjustment for seasonal variation by the procedure described
earlier.¹²

This table strongly supports the earlier conclusions. From the first
row of the table we observe that the correlation of $E_t/A_{t-1}$ with
$A_{t-4}/A_{t-1}$ is, in all cases, positive and very high, generally about .9.
This means that shippers tend to anticipate expansion when shipments
have been falling and to anticipate contraction when shipments have
been rising. The extent and regularity of this peculiar phenomenon are
brought out in the scatter diagram of Chart 4. In this chart, which is
based on all 55 quarters after seasonal adjustment, the anticipated rate
of change, $E_t/A_{t-1}$, is plotted against the rate of change over the past
three quarters, $A_{t-1}/A_{t-4}$, which is the reciprocal of $A_{t-4}/A_{t-1}$. In this
¹¹ In addition to this method, seasonal variation was eliminated from the data in some cases by
[...] estimates of the seasonal factors, as will be discussed later. The two methods yielded [...]


¹² The seasonal factors used in the correction are as follows: first quarter, .064; second quarter,
1.135; third quarter, .968; fourth quarter, .984.

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TABLE 3 


MULTIPLE CORRELATIONS ON EXPECTED AND ACTUAL
RATES OF CHANGE OF SHIPPERS’ FORECASTS


(1)                 (2)      (3)      (4)       (5)      (6)
Row*                Q1       Q2       Q3        Q4       Total
1. r_12             .93      .93      .91       .82      .90
2. r [illeg.]       .27     -.04    [illeg.]    .02      .01
3. r_02            -.36     -.21     -.41       .04     -.20
4. r_01            -.14     -.24     -.43       .44     -.01
5. r [illeg.]       .94      .94      .90       .90      .91
6. r [illeg.]      -.42     -.23     -.43       .24     -.16
7. r [illeg.]       .41      .35      .09       .64      .42
8. r [illeg.]       .35     -.13    [illeg.]    .46      .21


* $X_1 = E_t/A_{t-1}$; $X_2 = A_{t-4}/A_{t-1}$; $X_3 = A_{t-4}/A_{t-5}$; $X_0 = A_t/A_{t-1}$.


way the chart enables the anticipated rate of growth to be compared
with the past rate of growth. As is illustrated by the scatter chart, when
shipments have been rising, shippers anticipate contraction, and vice
versa. Shippers’ anticipations tend to be against the recent trend, and
the more so the stronger the trend.

This trend-reversing character of anticipations finds no justification
or counterpart in the actual rate of shipments, as can be seen from the
third row of Table 3. The correlation between $A_t/A_{t-1}$ and $A_{t-4}/A_{t-1}$,
far from being large and positive, is slightly negative in three of the
quarters and practically zero in the remaining one. For all quarters
seasonally adjusted it is negative, though not significantly different
from zero. It therefore follows that the high positive correlation
between $E_t/A_{t-1}$ and $A_{t-4}/A_{t-1}$ can be explained only by the unusual
manner in which shippers’ anticipations are formed. This is due to the
fact that in forecasting the next quarter shippers discount the change
that has already occurred and anticipate that shipments will regress
from $A_{t-1}$ toward $A_{t-4}$.

The last four rows of Table 3 present the results obtained when
$A_{t-4}/A_{t-5}$, or $X_3$, is introduced into the regression relationship. As
expected, this variable does not contribute much to the correlation,
though significantly related to $E_t/A_{t-1}$ when $X_2$ is held constant.


4.5 Extension of Basic Hypothesis


Either or both of the two factors involved in (1.4) may in reality be
modified by other considerations. Thus, in regressing toward the past,
the shipper may regress toward $A_{t-4}$ alone, toward $A_{t-4}$ modified by
$A_{t-4}/A_{t-8}$ to allow for year-to-year trends, or toward the average value
of past shipments in the given quarter as measured, say, by


    $(A_{t-4} + A_{t-8} + A_{t-12})/3$.


Similarly, in adjusting the past level of shipments for intervening
growth, the shipper may use the change in shipments from quarter
$t-5$ to $t-1$ alone, or with allowance for the rate of change in shipments
by including some such factor as $(A_{t-1}/A_{t-5})/(A_{t-2}/A_{t-6})$, or, in
addition, the change in shipments from quarter $t-9$ to quarter $t-5$. One
could also consider changes in the rate of change of shipments and
similar trends, but the later empirical results do not provide much
indication that such additional factors would be significant.

CHART 4. Scatter Diagram of $E_t/A_{t-1}$ with $A_{t-1}/A_{t-4}$.

By combining the level-factor hypotheses with the adjustment-factor
hypotheses, a number of functions were constructed for empirical
study. Of those, the ones that worked out best (in terms of goodness of
fit, significance of coefficients, effect of substituting $A_t$ as dependent,
serial correlation of the residuals, and accuracy of postwar estimates)
are shown in Table 4. All of these functions fit the data very well. The
use of ratio variables yielded somewhat closer fits than the use of
differences, but there was little to be said regarding the merits of
logarithmic versus arithmetic forms.

The results suggest that the rate of change in shipments,
$(A_{t-1}/A_{t-5})/(A_{t-2}/A_{t-6})$, plays some role in the forecasts. The addition of
this variable is not only statistically significant, but also reduces the serial


TABLE 4 


ESTIMATES OF PARAMETERS OF SELECTED FUNCTIONS 
EXPLAINING SHIPPERS' FORECAST 


   (1)          (2)             (3)            (4)            (5)        (6)
                                R², E_t        R², A_t        Average absolute
Hypothesis    Functionᵃ         dependent      dependent      error
                                                              1927-41    1947-50


A E,7.087-1-.556** A, «+. 430** -1 93 .%0 3. 2.4 
ЕЛҮ єк му Aca /453)—1] 0% % 


Р Е=.073--.467** A, „411** e —1] .973 .899 3.0 2.6 
Я "i. a 2/4 5)—1] 
AM Ay ее -1| 
(.070) (45/43) 
J log E,--log .022-1-.972** log А .A51** 1, МА «9796. .893 4.0 1.7 
т: ms log [4:2/: Eds 
—.080* log [At 2/4: 4] 


1081) 
L ов E,=log .022-4-.972°*log At at.424** log [Ae 1/Ay a] 2083 .84 40 16 
oon) со) eA ag) el 


(.025) 
Ata Ac 
4.092 log | —— —— | —.047 log [Ay 4/Ar_s] 
(.063) 5 ‚Аг / Aca. (unl 


ᵃ Figures in parentheses are standard errors of coefficients. One asterisk indicates significance at the .05 probability
level; two asterisks, at the .01 significance level.


correlation in the residuals of the other functions to a level where it is
no longer statistically significant. The positive sign of this coefficient
suggests that the regression of expectations toward the past is
somewhat modified by the recent rate of change of shipments. At the same
time, the coefficient of this variable indicates that the extent of this
modification amounts only to about 10 percent of the recent rate of
change so that regression toward the corresponding quarter of the
previous year still remains the dominant pattern of the forecast.
The hypothesis that the shipper adjusts $A_{t-4}$ for recent trends by
means of the ratio $A_{t-4}/A_{t-8}$ proved to be the best of the level
hypotheses. Particular interest attaches to the fact that the coefficient of this
variable is negative, indicating that the forecasts tend to be lower when
$A_{t-4}$ is large relative to $A_{t-8}$, and that the level toward which the
expectations regress is some average of $A_{t-4}$ and $A_{t-8}$, instead of merely
$A_{t-4}$.
These tests would therefore seem to lead to the conclusion that the
hypothesis best explaining the structure of the over-all forecasts is of 
the nature: 


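$$E_t = k\,A_{t-4}^{a}\left[\frac{A_{t-1}}{A_{t-5}}\right]^{b}\left[\frac{A_{t-1}/A_{t-5}}{A_{t-2}/A_{t-6}}\right]^{c}\left[\frac{A_{t-4}}{A_{t-8}}\right]^{d}$$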

Though this function fits the data very closely, it is disappointing that
the residuals show some evidence of positive serial correlation, at least
at the .05 level of significance. This is rather surprising in view of the
fact that no significant serial correlation appears in the residuals of the
same function excluding $A_{t-4}/A_{t-8}$, i.e., hypothesis J; and it may
indicate that at least one other relevant systematic factor is being
omitted. However, tests with various other possible variables proved
unsuccessful in uncovering the missing factor.

The last two variables in this function also do not appear to be
statistically significant. Nevertheless, the fact that each variable taken
separately is significant (hypotheses F and J), and that a combination
of the two variables significantly improves the goodness of fit of the
function, seems to bear out the relevance of these variables.

The last three columns present additional evidence on the reliability
of the results. Column 4 shows that the goodness of fit of the functions
with $A_t$ dependent is also very good, but nevertheless nowhere near as
high as when $E_t$ is the dependent variable. The last two columns present
what is perhaps the acid test of a satisfactory forecasting function, its
accuracy outside the period of observation. The history of economic
statistics is replete with hypotheses which were highly successful with
reference to the period under study but which proved ineffective when
used for forecasting.

In the present case, the regression functions are, if anything,
apparently more accurate for predictions than they were during the period
of observation. The probable explanation for this phenomenon, however,
is simply the greater amplitude of the fluctuations of the shippers'
estimates during the period of observation. To test the plausibility of
this explanation a "coefficient of determination of prediction" for
hypothesis L was computed as 1 minus the ratio of the variance of the
postwar residuals to the variance of the postwar shippers' estimates.
The result, .79, is a good deal below the corresponding determination
coefficient of .98 for the period of observation. This is about what
would be expected considering the nature of the postwar period, its
frequency of strikes, and the relative accuracy of the predictions.
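The "coefficient of determination of prediction" described above can be
sketched in a few lines. This is only an illustration of the stated
definition (1 minus the ratio of two variances); the function name and the
sample figures are hypothetical and are not taken from the study.

```python
from statistics import pvariance


def determination_of_prediction(residuals, estimates):
    """1 minus var(residuals) / var(estimates), as defined in the text.

    Both variances use the same divisor, so the choice between a sample
    and a population variance does not affect the ratio.
    """
    return 1.0 - pvariance(residuals) / pvariance(estimates)


# Hypothetical figures, for illustration only (not the study's data).
estimates = [100.0, 110.0, 90.0, 120.0]   # shippers' estimates
residuals = [2.0, -2.0, 2.0, -2.0]        # estimate minus function value
value = determination_of_prediction(residuals, estimates)
print(round(value, 3))
```

A value near 1 means the residuals are small relative to the swings in
the estimates themselves, which is why a period with narrower
fluctuations can show smaller errors and yet a lower coefficient.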


In practice, allowance was made for unusual events in a particular quarter by omitting the data



5. IMPROVEMENT OF THE FORECASTS 


Of the numerous ways in which attempts might be made to improve
the forecasting value of the shippers' forecasts, attention was focused
on the light the residuals of the hypotheses might cast on the course of
shipments. Two distinct questions may be raised in this connection,
namely:

1) Do the residuals of the functions estimating anticipations,
$E_t - E_t^c$, seem to be associated with the deviations of actual
shipments from the function estimate, i.e., $A_t - E_t^c$? If the
deviation of expectations from the explained component of $E_t$ is in the
right direction, it would indicate that these deviations tend to improve
the forecasts, and would also suggest the possible omission of other
factors influencing expectations having a more direct bearing on
actual shipments.

2) If the answer to the first question is in the affirmative, can we
make use of this information to derive a more accurate prediction
of actual shipments than is obtainable through the use of the
shippers' anticipations alone? Of the many forms such an attempt
might take, limitation of available resources necessitates restricting
ourselves to inserting the residual $U_t = E_t - E_t^c$ as an additional
variable in a regression relating $A_t$ and $E_t$, i.e., taking $A_t$ as a
function of $U_t$ and of $E_t$. What this does in effect is to increase the
importance of $U_t$ relative to $E_t$ on the basis of the extent to which
each of these variables is associated with $A_t$; and from a forecasting
point of view this is exactly what we want, if $U_t$ is in the direction
of actual shipments.


5.1 Comparison of Residuals

One means of answering the first of the questions raised in connection
with the use of the residuals is to compute the partial correlation of
$A_t$ on $E_t$, holding $E_t^c$ constant. These partial correlations are
shown in Table 5 for all manufactured commodities and for each of five
selected industries. The values of $E_t^c$ for total manufactured
commodities are based on hypothesis L, and for individual industries are
based on the most appropriate hypothesis in each case.
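For readers who want to see how such a coefficient is obtained, the
following sketch uses the standard formula for a first-order partial
correlation from three pairwise correlations. It is an illustration only:
the function name and the input values are hypothetical, and the study's
own pairwise correlations are not reproduced here. In the present setting
variable 1 would be actual shipments, variable 2 the shippers' forecast,
and variable 3 the function estimate that is held constant.

```python
from math import sqrt


def partial_correlation(r12, r13, r23):
    """Correlation of variables 1 and 2 with variable 3 held constant."""
    return (r12 - r13 * r23) / sqrt((1.0 - r13 ** 2) * (1.0 - r23 ** 2))


# Hypothetical pairwise correlations, chosen only to show the arithmetic.
example = partial_correlation(0.5, 0.5, 0.5)
print(round(example, 4))
```

When the two variables are each closely correlated with the one held
constant, the denominator becomes small, so the partial coefficient can
move a long way on small differences in the pairwise values.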


for that quarter from the analysis, if some adjustment could not be made.
In most instances, however, adjustments were possible. For example, the
most frequent of the unusual events is labor stoppage. From information
on the duration of the stoppage and its effect on industry employment or
production, the amount of production or shipments that would have
occurred in the absence of the stoppage can be estimated fairly well.
Unfortunately, it is very difficult to make a similar adjustment in the
shippers' forecasts, as there is generally no easy way of measuring the
extent to which the event may have been anticipated.
Except for flour, some tendency apparently does exist for the
residuals of the forecasts to be associated with the deviation of $A_t$
from $E_t^c$. This tendency does not seem to be very great, though the
correlations are significant at the .05 probability level
($r_{.05} = .336$) and some also at the .01 significance level
($r_{.01} = .410$). The fact that these five correlations are
significant, however, would indicate once again that relevant variables
are omitted from our hypotheses despite the very close fits obtained for
them.

Examination of these relationships by phase of the cycle also revealed
little evidence of association between the residuals, with one
outstanding exception, the 1937-38 recession. In the last three quarters
of 1938, when the use of $E_t^c$ would have led to substantial
overestimates, the shippers departed markedly from this formula in their
forecasts and anticipated shipments much better than the formula itself.
The same phenomenon in reverse occurred in the last quarter of 1938 and
the first two quarters of 1939.


TABLE 5


PARTIAL CORRELATION COEFFICIENTS OF $A_t$ ON $E_t$,
HOLDING $E_t^c$ CONSTANT


Industry                      Partial correlation coefficient

Iron and steel                          .41
Lumber                                  .46
Flour                                  -.25
Cement                                  .38
Agricultural implements                 .45
All manufactured goods                  .44


For individual industries, the residuals are correlated much more
closely in the later years of the prewar period, from 1937 to 1941.
Lumber and iron and steel shippers in particular seem to have anticipated
very well the fluctuations of their industry's shipments during this
period; the coefficients of determination between $A_t - E_t^c$ and
$E_t - E_t^c$ from 1935 to 1941 for these two industries are .49 and .81,
respectively. For the postwar period, however, little relationship was
detected between the two sets of residuals.


For the entire prewar period, the corresponding coefficients are .18 and .16, respectively.


5.2 Use of Residuals to Improve Accuracy of Forecasts

Since some relationship was detected between the residuals in the
prewar period, it would seem worthwhile to investigate the effect of the
residuals on the forecasts. Multiple regressions of $A_t$ on $E_t$ and
$U_t$ were computed for each of the five industry functions and for total
manufactured commodities for the prewar period. Estimates of the relevant
correlation parameters are presented in Table 6.

The parameter in this table of principal interest from the point of
view of the present analysis is $r_{13.2}$, the partial correlation
coefficient of $A_t$ with $U_t$ when $E_t$ is held constant; this is the
same as the partial correlation of $A_t$ with $E_t^c$, holding $E_t$
constant. If the residuals do provide some indication of the future
course of shipments in the prewar period, some correlation between the
residuals and actual shipments should be evident when the shippers'
forecasts are held constant. In addition, our hypothesis as to the nature
of the relationship between the residuals and actual shipments postulates
that any such correlation that exists should be positive.


TABLE 6


EFFECT OF RESIDUALS ON ACCURACY OF SHIPPERS' FORECASTS,
BY SELECTED INDUSTRIES, 1927-41


Measure*     All manu-    Iron and   Lumber   Flour         Cement   Agricultural
             factured     steel                                      implements
             commodities

$r_{12}$       .92          .84        .93      .84           .96      .93
$r_{13}$       .20          .24        .18      [illegible]   .11      .19
$r_{13.2}$     .10          .07        .19      [illegible]  -.04      .12
$R_{1.23}$     .92          .84        .94      .84           .96      .93

* $X_1 = A_t$; $X_2 = E_t$; $X_3 = U_t = E_t - E_t^c$.


The estimates of $r_{13.2}$ in Table 6 do not provide any evidence that
the residuals are of value in indicating future trends. The values of
$r_{13.2}$ are in the right direction for all except one of the six cases,
but in no instance is the estimate of the coefficient statistically
significant at the .05 probability level. As a result, the addition of the
residuals to the regression function fails to provide any noticeable
improvement in the goodness of fit, as is evident from comparing the
values of $r_{12}$ and the corresponding ones of $R_{1.23}$. Although the
deviations of the forecasts from the function are in the right direction,
the relationship is apparently not systematic enough to serve as a basis
for forecasting. Much the same results were found in the case of regions.
All in all, therefore, it would seem that improvement of the accuracy
of the forecasts through the addition of the residuals of the shippers'
estimating function as an extra variable in the regression of $A_t$ on
$E_t$ does hold some promise, but only if used on a selective basis. In
other words, judicious use of the residuals for some industries and not in
others might lead to greater accuracy not only for those particular
industries but on an over-all basis as well.


6. SUMMARY 


The main findings of this study revolve around the questions of the
accuracy and the structure of the shippers' forecasts. On the subject
of accuracy, we have obtained the following main results:

1. The forecasts tend to lag behind observed changes. Turning
points, in particular, are almost invariably overshot by at least one
quarter. Although the average percentage error of the forecasts is not
high, the rate of change of shipments is generally missed altogether.

2. The forecasts tend to err more when carloadings are declining
than when they are rising, and are most accurate when carloadings
remain approximately level.

3. The forecasts do not compare favorably in general with other
elementary forecasting models but are somewhat more accurate in the
postwar years, 1946-50. The latter may be due either to inherent
improvement in the forecasts or to the special circumstances prevailing
during this period.

On the structure of the forecasts, the main results may be summarized
as follows:

1. The preparation of the forecasts appears to involve modification
of the shippers' carloadings in the corresponding quarter of the previous
year for the change in trend in carloadings over the year. These two
factors alone account for over 97 per cent of the variance in the
forecasts. The addition of variables reflecting past rates of change
increased the explained variance somewhat further.

2. In applying the above adjustment, the shippers in the aggregate
tend to allow for only a fraction of the change that has occurred in the
past year. The result is a regression of shipment anticipations toward
the past, particularly to $A_{t-4}$ modified by $A_{t-8}$. This regression
phenomenon, which seems to have no counterpart in actual carloadings,
explains why the forecasted change is typically counter to the recent
trend.

3. There is some evidence that the residuals of the regression
equations explaining the formation of expectations, though very small, are
associated with the deviations of actual shipments from the equation
estimates. However, attempts to improve the accuracy of the forecasts
by correlating actual shipments with the forecasts and with these
residuals met with only limited success. Evidently, the relationship is
not sufficiently systematic to be of much help.

These findings are not without limitations, a brief review of which
serves both to place the findings in a proper perspective and to point
the way to future work in the field. In the main, four such limitations
would seem to exist. First, the data may represent the anticipations
of only one sector of the business community. Although the people
represented, typically traffic managers, are of some importance in
their firms, they are probably not on a policy level and may not be fully
informed of the firm's future operations.

Second, it cannot be overemphasized that the entire analysis has
been carried out in terms of aggregates and that we have no direct
evidence as to the frequency or even the existence of the regression
phenomenon among individual shippers' forecasts. Thus, this phenomenon
as observed in this study might conceivably have resulted
from extrapolation of the level of the corresponding quarter of the
preceding year by a large group of the respondents and extrapolation of
trend by another large group.

Third, the results refer to quarterly data only. This study presents
no information on the effect of some other time unit on the accuracy
and structure of anticipations. Fourth, the unavailability of other data
may exaggerate the extent to which the shippers rely on past railroad
shipments in arriving at their forecasts. Thus, orders data proved
relevant in two of three industry-region functions for which they were
available. The removal of these limitations through further study of the
railroad shippers' forecasts and securing other data on expectations is a
task for future work in this field.


ELECTRONIC COMPUTATION IN ECONOMIC STATISTICS 


J. A. C. BROWN, H. S. HOUTHAKKER, AND S. J. PRAIS
University of Cambridge* 


1. INTRODUCTION 


TECHNICAL advances in electronic engineering which have taken
place in the last decade have led to an enormous advance in computing
technology with the recent development of a general purpose
electronic computer. The main features of this are its high speed of
operation and the ease with which the user is able to adapt the automatic
facilities of the machine to his own particular problem by means
of a "program". In other words, the user is able to introduce
loop-systems of any desired degree of complexity into the machine with
great facility.

As is to be expected, serious discussions of the use of the computer
have so far generally been confined to engineering and mathematical
circles,¹ with the result that potential users from other fields such as
economics and statistics have not fully appreciated the advantages that
the new techniques offer in the practical solution of problems and the
opening of new lines of research. It is the object of this paper to give a
non-technical description of one electronic computer known to the authors
and an account of some of its applications in the field of economic
statistics.

The ensuing discussion will be in the following order. In the second
section a brief account is given of the characteristics of an electronic
computer limited, however, to the extent to which this is required for
an understanding of its applications. None of the engineering aspects
will be discussed. The reader who is interested in further details may
be referred to the excellent Cantor lectures [15] delivered by Dr. M. V.
Wilkes, Director of the University Mathematical Laboratory in Cambridge,
to the Royal Society of Arts in November 1951, for a clear and
full description of this and other automatic computers.²

The alternatives currently provided by punched card methods and
desk calculating machines are surveyed in the third section and some
attention is given to the question of relative costs. This leads to a
discussion in the fourth section of the delicate problem of the
"economics of programming". In the fifth section a more detailed account
is given of some of the statistical problems that have so far been solved
by the use of the electronic computer, and in the final section the
argument is summarised.

* The second author is now at the University of Chicago.

1 The popular (and not so popular) philosophical discussions of the "electronic brain" require no
consideration here.

2 For a detailed description of programming on the Edsac reference may be made to the account by
Wilkes, Wheeler, and Gill [17], and the more elementary account by Hartree [6], Chapter XII.


2. A DESCRIPTION OF THE ELECTRONIC COMPUTER 


The applications to be described below were made on the Edsac, the
Electronic Delay Storage Automatic Calculator of the University
Mathematical Laboratory in Cambridge.* This machine, which started
operation in 1949, is suitable for all mathematical manipulations that
can be put in numerical form; that is, it is a digital machine and not an
analogue machine (such as a slide rule). It will carry out sequences of
elementary "orders", such as addition and multiplication, which are
determined in advance by the user and fed into the machine by means
of punched paper tape together with the numerical data (if any). These
orders and numbers are stored by the Edsac in its "store" or "memory",
consisting of about 1000 "storage locations", whence they can be
transferred to the "arithmetic unit" whenever necessary. The latter unit,
in which actual operations take place, includes an "accumulator" and a
"multiplier register" which are analogous to the registers of a desk
calculator. Each storage location can hold one order or one "short"
number of 17 binary digits, but two short locations can be joined to
accommodate one "long" number of 35 binary digits, equivalent to about
10 decimal digits. The machine operates entirely in the scale of two,
although input and output are normally in the decimal system, the
necessary binary-decimal conversion being performed by orders previously
fed into the machine. Output from the machine is on a teleprinter
or paper tape.

The high speed at which orders are carried out (an addition takes 1.5
milliseconds, a multiplication 6 milliseconds) makes it desirable that
the execution of "programs" (sequences of orders properly combined)
should require as little human intervention as possible. In principle the
only stimulus needed by the Edsac to take in and carry out a complete
program is the pressing of the start button; the "control unit" of the
machine takes over from then on. This explains the word "automatic"
in its name. In complicated programs, however, it is occasionally
convenient to suspend operations briefly so that the user can intervene on
the basis of intermediate results.

A program therefore has to specify in complete detail (using the Edsac
order code) the elementary operations that are necessary to solve
the problem in hand, taking into account all the possible situations that
may arise during its execution. A simple example may help to explain
the nature of a program and the detail in which it has to be specified.

* For a technical description, see the articles by Wilkes and Renwick [16], and by Wheeler [14].

Suppose the square root of the positive number $y$ is to be found by
means of the first order iterative process

$$x_{r+1} = x_r - x_r^2 + y$$

where $x_r$ converges to $\sqrt{y}$ from below if
$0 < y < \tfrac{1}{2}(3-\sqrt{5}) \approx 0.38$, as is assumed here, and
$x_0$, the first trial value, is zero. The sequence of orders will then be
as follows, supposing that $y$ is stored in location 1 and $x_r$ in
location 2.


(1) Put the contents of location 2 into the multiplier register.

(2) Multiply the number in location 2 by the number in the multiplier
register and subtract the result, $x_r^2$, from the accumulator
(supposed to be clear previously).

(3) Add the number in location 1 into the accumulator, obtaining
$y - x_r^2$.

(4) Test whether the number in the accumulator (which for the
admitted value of $y$ cannot be positive) is non-negative; if it is, the
process ends ($x_r^2$ being equal to $y$); otherwise, proceed to the next
order.

(5) Add the number in location 2 into the accumulator, obtaining
$x_r - x_r^2 + y = x_{r+1}$.

(6) Transfer the number in the accumulator to location 2, leaving
the accumulator clear.

(7) Return to order (1), starting a new iteration.
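The arithmetic of this iteration can be tried out in a modern language.
The sketch below is not a transcription of the Edsac orders: it carries
out the recurrence in 34-bit fixed-point arithmetic (an assumed word
length chosen for illustration), and it stops once the residual is no
longer positive, which is a stopping rule assumed here. It also restricts
$y$ to at most 0.25, a narrower range than the one quoted in the text,
because in that range the iterates increase steadily toward the root. The
names `BITS`, `SCALE` and `iterative_sqrt` are hypothetical.

```python
BITS = 34            # assumed number of fractional binary digits
SCALE = 1 << BITS    # fixed-point representation of 1.0


def iterative_sqrt(y):
    """Square root by x <- x - x*x + y, starting from x = 0.

    Numbers are held as integers scaled by 2**BITS, so every step is
    exact integer arithmetic apart from the truncation of x*x.
    """
    if not 0.0 < y <= 0.25:
        raise ValueError("this sketch assumes 0 < y <= 0.25")
    y_fixed = round(y * SCALE)
    x = 0
    while True:
        residual = y_fixed - ((x * x) >> BITS)   # y - x^2, truncated
        if residual <= 0:                        # assumed stopping rule
            break
        x += residual                            # x - x^2 + y
    return x / SCALE


root = iterative_sqrt(0.16)
print(root)
```

Because the residual is a whole number of units in the last place, each
pass either raises $x$ by at least one unit or ends the loop, so the
process cannot run indefinitely.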


If these orders are stored in the locations beginning at 100, they are
coded as below, where the "function letters" have the following
meanings:

Hn means put the number in location n into the multiplier register.

Nn means multiply the number in location n by the number in the
multiplier register and subtract the result from the number in
the accumulator.

An means add the number in location n to the number in the
accumulator.

Tn means transfer the number in the accumulator to location n,
leaving the accumulator cleared.

En means test whether the number in the accumulator is positive
or zero. If it is, proceed next to location n; otherwise, proceed
serially.
SET OF ORDERS FOR COMPUTING A SQUARE ROOT

Location    Order

100         H 2
101         N 2
102         A 1
103         E 107
104         A 2
105         T 2
106         E 100


The most important order in this program is that used in (4), in
which the choice of the next operation is made to depend on the state
of the accumulator at that time. Normally, orders are carried out
serially according to their position in the memory, but by such
"conditional transfers of control" the sequence can be altered whenever
necessary. Without this facility it would not be possible to determine
automatically whether an iterative or other repetitive process is
completed, and it is therefore an essential feature of automatic
computation. In this way the whole or parts of a sequence of orders can be
carried out repeatedly, as is the case here. If necessary, orders elsewhere
in the program can change some orders in a sequence (especially their
addresses) from one "cycle" to the next one, which may lead to a
considerable saving in the number of orders to be stored. Thus if it is
required to add together the numbers in locations 200 to 299 we do not
need a hundred A-orders with different addresses, but only one A-order
whose address is increased by one in each cycle.
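The idea of a single A-order whose address is altered from cycle to cycle
can be mimicked as follows. This is a loose illustration rather than
Edsac code: the store is modelled as a dictionary, the order as a
two-element list, and the name `sum_with_modified_order` is hypothetical.

```python
def sum_with_modified_order(memory, first, last):
    """Add memory[first..last] using one add-order with a changing address."""
    order = ["A", first]      # the single stored A-order: [function, address]
    accumulator = 0
    while order[1] <= last:   # conditional transfer: repeat until done
        function, address = order
        if function == "A":
            accumulator += memory[address]
        order[1] += 1         # another order increases the address by one
    return accumulator


# Illustrative store: locations 200 to 299 hold the numbers 1 to 100.
memory = {location: location - 199 for location in range(200, 300)}
total = sum_with_modified_order(memory, 200, 299)
print(total)
```

One stored order and a short loop thus replace a hundred separate
additions, at the cost of the few extra orders that alter the address and
test for completion.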

In summary, (a) the use of conditional orders, (b) the short time
required for each operation, and (c) the large number of storage locations
for holding intermediate results, enable the machine to perform the
most extensive and complicated calculations with great rapidity.


3. THE CHOICE BETWEEN ELECTRONIC AND 
OTHER CALCULATING MACHINES 


In any major statistical problem the effective choice which faces the
research worker lies between the fully automatic electronic machine
and the more usual equipment for handling punched cards consisting
of card sorters, tabulators, and a number of auxiliary machines. The
choice between these two types of machine on the one hand and the
desk machine on the other is usually too straightforward to merit
detailed discussion: it suffices to note that the use of desk machines is
advisable in the exploratory stages of a piece of analysis, to determine
orders of magnitude, to forecast the probable character of the results,
and to provide checks which may not be included in the program for the
automatic machines. Occasionally some work may be left for the desk
machines in the way of scalar transformation of the results, and one or
two subsidiary calculations which are best omitted from the more
automatic processes; but in principle there is no need for this.

* There are five other orders in the Edsac code which transfer control in a number of specified
circumstances.

The factor determining the choice between the two major types of
machines may best be considered in relation to the processes which are
carried out in solving a typical statistical problem from the recording
of the original data to the attainment of the final results. These
processes usually comprise (1) recording the data, (2) classifying and
sorting, (3) summarising, and (4) the estimation of numerical
relationships. In general, if the amount of original data is large, the
advantages of punched card machines are at present greatest in the first
three of these processes and of automatic electronic machines in the last.

Punched cards provide a compact and cheap form of record to which 
later reference is easy, particularly if the information on the cards is
reproduced in numbers and letters along one edge as can be done by an 
automatic interpreter. The sorting of the cards is extremely rapid (up 
to 40,000 card columns per hour), and once the cards are sorted they 
can be quickly summarised on the standard tabulator in which a rela- 
tive slowness of individual arithmetical operations is compensated by 
the ability to carry out a number of operations in parallel, together with 
a parallel type output of the results. On the other hand the range of 
arithmetical operations which can be carried out without recording and 
feeding back intermediate results is severely restricted, so that the use 
of normal punched card machines for the estimation of numerical re- 
lationships from a relatively small amount of summarised data is rarely 
economical.

With most of the currently operating electronic machines the cost of
reading and printing a large quantity of information is high and it is
usually inefficient to use these machines where the ratio of input plus
output time to computing time is large. Many statistical problems are
closer in terms of this ratio to those which arise in commerce than to
those which arise in mathematical or physical problems, and it is of
interest to quote an estimate which has been made in the former field.

Bowden [2] has recently considered the application of the electronic
computer at Manchester University to the production of a weekly
payroll for a factory of 3,500 employees, and has estimated that whereas
all the numerical computations could be carried out in 48 minutes the
printing of the payroll by a standard teleprinter would take some 12 to
14 hours (though recent developments in parallel output mechanisms
will reduce this drastically). Further, the storage on magnetic drums of
all the information which it would be necessary to carry forward each
week would be prohibitively expensive.

Few statistical applications have such a high ratio of input and output
to computing time as this, but two illustrations may be given from
our own experience. The first example concerned the calculation of
some 2,100 correlation coefficients of zero, first and second order from
a matrix of sums of squares and cross-products of order 37 × 37. On the
Edsac this was completed in 100 minutes, of which about 80 were
accounted for by input and output. Nevertheless the use of the Edsac
was economical, since the individual numerical processes were too complex
to lie within the range of punched card machines. The second example
was the formation of a matrix of sums of squares and cross-products
of order 8 × 8 from data which comprised 2,200 observations of
each variable. This calculation was completed on punched card equipment
with about 8 hours sorting and tabulating, whereas the Edsac
would have taken about 2 hours to read the information. The computation
and accumulation of the cross-products are carried out almost
simultaneously, but in view of the danger of any momentary failure
which would have rendered the final result worthless, it would be
advisable to split the operation into a number of parts which would be
summed later on a desk machine.
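The reason the operation can safely be split is that sums of squares and
cross-products are additive over batches of observations. The following
sketch illustrates this and then derives zero-order correlation
coefficients from the combined matrix. It is an illustration under stated
assumptions only: the function names and the four observations are
hypothetical and are not the data of either example.

```python
from math import sqrt


def sscp(rows):
    """Return (n, sums, cross-products) for one batch of observations."""
    k = len(rows[0])
    sums = [0.0] * k
    cross = [[0.0] * k for _ in range(k)]
    for row in rows:
        for i in range(k):
            sums[i] += row[i]
            for j in range(k):
                cross[i][j] += row[i] * row[j]
    return len(rows), sums, cross


def combine(parts):
    """Add together the results of several batches."""
    n = sum(p[0] for p in parts)
    k = len(parts[0][1])
    sums = [sum(p[1][i] for p in parts) for i in range(k)]
    cross = [[sum(p[2][i][j] for p in parts) for j in range(k)]
             for i in range(k)]
    return n, sums, cross


def correlations(n, sums, cross):
    """Zero-order correlations from raw sums and cross-products."""
    k = len(sums)
    c = [[cross[i][j] - sums[i] * sums[j] / n for j in range(k)]
         for i in range(k)]
    return [[c[i][j] / sqrt(c[i][i] * c[j][j]) for j in range(k)]
            for i in range(k)]


# Hypothetical observations on three variables.
data = [(1.0, 2.0, 5.0), (2.0, 4.0, 3.0), (3.0, 6.0, 4.0), (4.0, 8.0, 1.0)]
whole = sscp(data)
in_parts = combine([sscp(data[:2]), sscp(data[2:])])
r = correlations(*in_parts)
```

If one batch is spoiled by a machine fault, only that batch need be
repeated; the others are simply added in as before.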

It should be stressed that the points made in the preceding paragraphs
are necessarily provisional since computing technology is still
changing rapidly. On the one hand the range of punched card equipment
is being extended as, for example, by the development of an electronic
multiplier with facilities for transfer between registers. On the
other hand attention is being given by the designers of the large
electronic machines to improving input and output facilities, including
parallel type output, and input and output devices which can operate
at the same time as computations are in progress in other parts of the
machine. Better and cheaper forms of semi-permanent storage are also
to be expected.

One further point should be kept in mind. The technical knowledge
required to operate a punched card machine is widely dispersed, and
most of the methods which are useful in statistical problems are well
established. By contrast the research worker must usually expect to
invest a good deal of time, sometimes many weeks, in programming a
new problem for an electronic computer. Even if electronic methods are
clearly the best if a program exists, they may be eschewed if such a
program is likely to be used no more than once. In general if the amount
of time which can be spared for programming is limited, it will pay to
invest it in the construction of programs of the most general
applicability and to use the more traditional computing devices for
problems with strong individual characteristics even at the cost of extra
computing time.

* For an introductory account of punched card methods in the analyses of survey data see Yates
[18], Sections 5.11 to 5.19.

Since the technical developments are still in full progress it is too
early to say anything definite on financial costs, but at present the cost
of a typical full-sized high speed machine with input-output equipment
adequate for statistical purposes in the United States is probably in
the region between $400,000 and $1,000,000, equivalent to between
$100 and $300 per hour of utilization. In Britain the cost per hour of
utilization is probably between £50 and £100. Thus very roughly, the
electronic machines are about 100 times as expensive to use as desk
machines, and about 25 times as expensive as punched card machines.


4. THE ECONOMICS OF PROGRAMMING 


Once the decision to use the high-speed machines has been made,
there remain a number of decisions to be made with regard to the
particular form of program to be adopted. For the most part the criteria
governing the construction of programs are conflicting, and the
programmer will have to find the most efficient compromise for his
purpose. In this section we discuss the five criteria which comprise the
problem we have called the "economics of programming".

A complete program consists of all the orders necessary for a specific 
calculation such as the finding of the roots of an equation with given 
coefficients. This calculation will involve some operations that also 
arise in other programs, and for which the relevant order sequences can 
be prepared once for all. Of these we may mention as examples such 
common operations as division,* evaluating trigonometric functions, 
integrating a differential equation and inverting a matrix; input (read- 
ing numbers from the tape) and output (printing the results) also come 
into this category. These order sequences are known as “sub-routines”; 
they are indispensable for the efficient utilization of any automatic 
computer and are made accessible to users through a “library of sub- 
routines”. Many programs consist entirely of a few library routines 
linked together by a short “master routine” in addition to the numerical 
information. The square root program described above is strictly also 


* Unlike some other machines the Edsac has no built-in division. 
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a sub-routine, if only because it contains no input and output facilities. 

a. Speed. There are usually many claimants for computing facilities 
on an electronic machine and users will therefore have to economize in 
the time they occupy. Furthermore, the possibility of breakdown makes 
short runs desirable. Speed is an especially important consideration in 
iterative or repetitive calculations, which frequently occur in this type 
of work. 

b. Size. The capacity of the Edsac's high-speed memory is small so 
that ingenuity in programming is often required in order to save storage 
space and hence operating time. For instance in inverting a matrix it is 
necessary to store both the program and the elements of the matrix; 
hence the more space is taken by the program, the smaller is the order 
of the matrices which can be accommodated. This problem is now less 
important with the development of large auxiliary storage facilities on 
magnetic tape which has rapid access time. 

c. Accuracy. In all digital machines significant figures are lost be- 
cause only a limited number of digits can be used to represent a num- 
ber. A trained human computer counteracts this almost sub-con- 
sciously by shifting the decimal point so as to retain the required num- 
ber of significant figures. The Edsac library contains some “floating 
point routines” which operate in the same fashion and relieve the pro- 
grammer from the difficult task of considering the magnitude of each 
number during the execution of a program. 

In all the more complicated programs it seems that the use of a gen- 
eral purpose floating decimal routine has much to commend it. It is pos- 
sible to arrange this in a form so that ordinary orders are “interpreted” 
and the operations are carried out in the machine with the appropriate 
adjustment of the decimal point. The decrease in the speed of the ma- 
chine is significant, but the construction of the program is made much 
simpler. 
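The idea behind such a floating decimal routine can be illustrated with a minimal sketch. It is not the Edsac routine: the names `normalize`, `fmul`, `fadd` and `to_float`, and the working precision of 7 digits, are assumptions made purely for illustration. A number is held as a whole-number mantissa and a power-of-ten exponent, and the decimal point is shifted after every operation so that the leading figures are retained.

```python
DIGITS = 7  # assumed working precision, chosen only for this illustration


def normalize(mantissa, exponent):
    """Shift the decimal point so that the mantissa has exactly DIGITS digits."""
    if mantissa == 0:
        return 0, 0
    sign = -1 if mantissa < 0 else 1
    m = abs(mantissa)
    while m >= 10 ** DIGITS:        # too long: low-order figures are dropped
        m //= 10
        exponent += 1
    while m < 10 ** (DIGITS - 1):   # too short: shift left to keep leading figures
        m *= 10
        exponent -= 1
    return sign * m, exponent


def fmul(a, b):
    """Multiply two (mantissa, exponent) pairs and renormalize."""
    return normalize(a[0] * b[0], a[1] + b[1])


def fadd(a, b):
    """Add two (mantissa, exponent) pairs after aligning their exponents."""
    (ma, ea), (mb, eb) = a, b
    if ea < eb:
        (ma, ea), (mb, eb) = (mb, eb), (ma, ea)
    # ea >= eb here; rescale the first mantissa to the smaller exponent
    return normalize(ma * 10 ** (ea - eb) + mb, eb)


def to_float(x):
    """Convert a (mantissa, exponent) pair to an ordinary float for display."""
    return x[0] * 10.0 ** x[1]


small = normalize(123, -10)    # 1.23e-8 held as (1230000, -14)
big = normalize(4567, 3)       # 4.567e6 held as (4567000, 0)
product = fmul(small, big)     # (5617410, -8), that is 0.0561741
```

Adding `normalize(1, -9)` to `normalize(1, 0)` in this sketch returns the first operand unchanged, which shows the loss of significant figures that the text describes: the smaller term falls entirely below the seven retained digits.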

d. Range of application. Programs and especially library sub-routines 
are evidently more useful the greater is the number of specific problems 
for which they can be used. For example, if its other characteristics are 
the same, a routine which inverts all non-singular matrices is preferable 
to one which applies only to symmetric matrices. Similarly, it is desira- 
ble to have a wide range of convergence in iterative calculations. 

e. Ease of construction. Once a program or routine is finished the ef- 
fort spent on preparing and testing it is irrelevant to the user, but be- 
fore then there is frequently a choice between more and less difficult 
approaches, which may be expected to require different amounts of in- 
vestment in construction time and to yield different results in terms of 
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the four criteria mentioned previously. Much of the construction time 
will be spent in testing; as experience has shown that initial errors in 
programming are almost unavoidable and often difficult to locate, the 
Edsac library contains several sub-routines for the detailed analysis of 
trial programs. 


The optimum balance between these five criteria will depend on the 
problem in hand. In the case of library sub-routines, especially the more 
common ones, it is well worth the effort to seek the most efficient meth- 
ods of solution, so that the ease of construction criterion becomes of 
minor importance. Even so it is not always possible to say in advance 
which sub-routine is the most efficient for incorporation in some future 
program. Thus no square root routine which is at the same time small, 
fast, accurate and applicable to all positive numbers has yet been found. 
The routine discussed earlier as an example is small and accurate but 
it does not work for y > .39 (as the solution x_i has to approach √y from 
below) and converges but slowly for small values of y. It is therefore 
fast only in a limited range of the argument, which may be suitable for 
some programs but not for others. The Edsac library therefore con- 
tains several square root routines among which users may choose. An 
alternative may for instance be fast, accurate and generally applicable 
but occupy more storage space than our example. 

A substitution of wideness of application for size arises in operations 
on symmetric matrices, where much memory space can be saved by 
storing only the elements on and below the main diagonal. In lengthy 
calculations the maintenance of accuracy is often the dominant prob- 
lem and it may be necessary to adopt floating point techniques as 
pointed out above, even though they reduce the speed and considerably 
increase the number of orders. Sometimes the programmer may not be 
much interested in speed or size and prefer the least arduous way of ar- 
riving at a working program, so that the fifth criterion becomes de- 
cisive. 

As a result of technical progress in electronic computation some of 
these criteria may change in importance. The operating speed of some 
of the most recent large-size computers exceeds that of the Edsac by 
a factor of five or ten. These machines frequently also have an auxiliary 
“slow” memory in addition to the electronic memory, the former being 
used to store parts of the program that are not at that time being car- 
ried out. There is, however, also a contrary development towards less 
ambitious electronic computers that can be produced commercially at 
a fraction of the cost of the larger and faster constructions; there speed 
and size will be matters of great concern to the programmer. 


5. APPLICATIONS TO ECONOMIC STATISTICS 


We now turn to consider the applications of the electronic computer 
to econometric problems citing examples that have so far been pro- 
grammed for the Edsac. The main work has been concerned with re- 
gression analysis and least squares procedures but the programs de- 
veloped for these purposes have found other applications as well. 


5.1 The Moment Matrix 


The first problem to be solved was that of finding a suitable method 
for computing the moment matrix of a number of variables. The prin- 
cipal problem that arises here is one of size, as storing all the numerical 
information at one time may well exceed the capacity of the memory. 
A convenient solution is to arrange operations so that the program or 
the numerical information (or both) need only be taken in as they are 
wanted. The computation of a moment matrix is equivalent to multi- 
plying the matrix of observations by its transpose, and for this it is 
necessary to take into the store only one row of the matrix at a time, 
then (a) form the cross-products of the elements,7 (b) add them to the 
previous partial sums of corresponding cross-products and finally, (c) 
proceed to the next row of observations. The size of the store then sets 
no limit to the number of observations that can be taken into account 
but the number of variables may not exceed 25 with the present size of 
the Edsac store. 
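The row-at-a-time procedure described above can be sketched as follows. This is an illustration only, not the Edsac sub-routine: the function names `tri_index` and `accumulate_moments` and the packed storage layout are assumptions. Only the k(k+1)/2 distinct sums of squares and cross-products are held, so storage depends on the number of variables and not on the number of observations.

```python
def tri_index(i, j):
    """Position of element (i, j), with j <= i, in packed lower-triangular storage."""
    return i * (i + 1) // 2 + j


def accumulate_moments(rows, k):
    """Accumulate sums of squares and cross-products one observation at a time.

    `rows` may be any iterable (for example a file being read), so the
    observations never need to be in memory together.
    """
    sums = [0.0] * (k * (k + 1) // 2)
    for row in rows:
        for i in range(k):
            for j in range(i + 1):
                # (a) form the cross-product, (b) add it to the partial sum
                sums[tri_index(i, j)] += row[i] * row[j]
        # (c) proceed to the next row of observations
    return sums


moments = accumulate_moments(iter([[1, 2], [3, 4]]), 2)
# moments == [10.0, 14.0, 20.0]: sum of x1*x1, sum of x2*x1, sum of x2*x2
```

For 10 variables the packed list has 55 entries, whatever the number of observations.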

The foregoing is a very condensed description of the simplest statis- 
tical sub-routine in the Edsac library; there is also a slightly more 
complicated routine which gives weights to observations as is required 
if they are derived from grouped data. In the case of weighted regres- 
sions the gain in time is particularly impressive because most desk 
calculators are not well suited for the accumulation of triple products. 
It takes about 7 minutes on the Edsac to compute all the 55 weighted 
sums of squares and cross-products of 10 variables with 40 observations 
in addition to about 4 hours for punching and checking the number 
tape and verifying the results by a sum-check. A human computer with 
an electric desk machine would probably need about 75 hours for this 
job, so that 71 hours of labor are replaced by 7 minutes of machine time. 


5.2 The Inversion of the Moment Matrix 


The next step in regression analysis is the inversion of the moment 
matrix and requires a much more complicated routine; such a routine 


7 If there are k variables only k(k+1)/2 cross-products need be formed as the moment matrix is 
symmetric. 


applies, of course, to all symmetric positive-definite matrices. Among 
the various methods of inverting matrices one had to be chosen which 
(a) could be split into successive stages so as to save storage space and 
(b) avoided divisions as much as possible so as to save time. We se- 
lected the so-called Choleski method8 which involves converting the 
original matrix into the product of a triangular matrix and its trans- 
pose, inverting the lower triangular matrix and multiplying the inverse 
triangular by its transpose to give the inverse of the original matrix. 
The three phases can be programmed separately and do not require 
the storing of any intermediate results; at each stage only k(k+1)/2 
numbers (k being the order of the matrix) have to be in the memory. 
The only non-linear operation is the taking of k inverse square roots, 
which is done by a slight modification of the square root routine dis- 
cussed in Section 2 above. By operating in floating point form the ac- 
curacy of the inverse is kept at about 7 significant decimal figures. The 
routine works at a very satisfactory speed; for instance it takes about 
five minutes to invert an 11 × 11 matrix, and as much again to re-invert 
the result as a check. Only one half of this time is spent in actual com- 
putation; the remainder is used for reading the program and the original 
matrix, and for printing the inverse. At present the Edsac can deal 
with matrices up to the eighteenth order. 
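The three phases of the Choleski method can be sketched as below. This is a minimal illustration under stated assumptions, not the Edsac program: the function names and the packed lower-triangular layout are invented here, and each phase returns a new list for clarity instead of overwriting its input. In phase 1 the k inverse square roots are formed once and then used as multipliers in place of divisions.

```python
import math


def idx(i, j):
    """Position of element (i, j), with j <= i, in packed lower-triangular storage."""
    return i * (i + 1) // 2 + j


def factor(a, k):
    """Phase 1: packed lower triangle of A -> packed L such that A = L L'."""
    l = [0.0] * len(a)
    rdiag = [0.0] * k                      # the k inverse square roots
    for i in range(k):
        for j in range(i + 1):
            s = a[idx(i, j)] - sum(l[idx(i, p)] * l[idx(j, p)] for p in range(j))
            if i == j:
                rdiag[i] = 1.0 / math.sqrt(s)
            l[idx(i, j)] = s * rdiag[j]    # multiply by the inverse root
    return l


def invert_triangle(l, k):
    """Phase 2: packed L -> packed M, the (lower triangular) inverse of L."""
    m = [0.0] * len(l)
    for i in range(k):
        r = 1.0 / l[idx(i, i)]
        m[idx(i, i)] = r
        for j in range(i):
            s = sum(l[idx(i, p)] * m[idx(p, j)] for p in range(j, i))
            m[idx(i, j)] = -s * r
    return m


def multiply_by_transpose(m, k):
    """Phase 3: packed M -> packed lower triangle of M'M, the inverse of A."""
    b = [0.0] * len(m)
    for i in range(k):
        for j in range(i + 1):
            b[idx(i, j)] = sum(m[idx(p, i)] * m[idx(p, j)] for p in range(i, k))
    return b


def invert_spd_packed(a, k):
    """Run the three phases in turn on a packed positive-definite matrix."""
    return multiply_by_transpose(invert_triangle(factor(a, k), k), k)


# The matrix [[4, 2], [2, 3]] stored as its lower triangle.
inverse = invert_spd_packed([4.0, 2.0, 3.0], 2)
```

For the 2 × 2 example the exact inverse is [[0.375, -0.25], [-0.25, 0.5]], and the sketch reproduces it to rounding error. Each phase works on k(k+1)/2 numbers, in line with the storage requirement stated in the text.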

In the case of matrices of orders less than thirteen it is not necessary 
to split up the program for input purposes and reading time can be 
saved by leaving it in the memory in its entirety. This makes it possible 
to invert several matrices in succession without interruptions for pro- 
gram input. In this way twenty 5 × 5 matrices could be inverted and re- 
inverted in 30 minutes, that is to say, 45 seconds for each inversion. 


5.3 Special Routines for Linear Regression 

By combining these routines most of the computations in regression 
analysis can be performed automatically. For particular problems it is 
on occasion worth taking a further step. Thus for its work on family 
budgets9 the Department of Applied Economics in Cambridge has de- 
veloped a number of programs which derive the parameters of different 
types of Engel curves (with their standard errors) directly from the 
basic data. The development of these programs is economical since the 
main burden of the work is to find the regressions of about 150 varia- 
bles—the expenditures on various commodities—on a few “fixed” vari- 
ables—household size and income, and the like. The economical solu- 


8 See Fox and others [5], or more conveniently Dwyer [3] p. 196. 
9 See Houthakker [8]. 


ELECTRONIC COMPUTATION IN ECONOMIC STATISTICS 425 


tion in this case is to keep the values of the “fixed” variables in the 
store and then read in the other variables one at a time. A further ra- 
tionalization might be achieved if the Edsac could take its numerical 
input from standard punched cards instead of from a specially punched 
paper tape, but this facility has not yet been arranged for the Edsac. 
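The arrangement just described, with the “fixed” variables held in the store and the dependent variables read one at a time, might be sketched as follows. The sketch assumes a single fixed variable plus a constant term so that the normal equations can be solved in closed form; the function names and the data are invented for illustration and are not taken from the Cambridge programs.

```python
def prepare_fixed(x):
    """Sums that depend only on the fixed variable; computed once and kept."""
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    det = n * sxx - sx * sx          # determinant of the 2 x 2 moment matrix
    return n, sx, sxx, det


def regress_each(x, dependents):
    """Yield (name, intercept, slope) for each dependent variable in turn.

    `dependents` is any iterable of (name, values) pairs, so only one
    dependent series needs to be in memory at a time.
    """
    n, sx, sxx, det = prepare_fixed(x)
    for name, y in dependents:
        sy = sum(y)
        sxy = sum(a * b for a, b in zip(x, y))
        slope = (n * sxy - sx * sy) / det
        intercept = (sy - slope * sx) / n
        yield name, intercept, slope


income = [1.0, 2.0, 3.0, 4.0]                       # made-up fixed variable
series = [("food", [5.0, 8.0, 11.0, 14.0]),         # made-up expenditures
          ("fuel", [9.0, 8.0, 7.0, 6.0])]
results = list(regress_each(income, series))
```

The saving comes from `prepare_fixed` being called once, however many dependent variables follow.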


5.4 Non-Linear Regression 

The value of electronic computation for statistical research is shown 
even more clearly in the case of calculations that would not be at- 
tempted without its aid. Least-squares regression analysis has custom- 
arily been confined to formulae where the parameters (possibly after 
some transformation) enter linearly, so that the normal equations are 
linear and can be solved by classical methods. Although this approach 
is no doubt satisfactory in most investigations it has proved too restric- 
tive for some special problems, particularly the estimation of “unit-con- 
sumer scales” in the analysis of family expenditure. The equations 
used here are of the type 


(1)    y = a + b \log \sum_{i=1}^{n} c_i x_i 

and 

(2)    y = \sum_{i=1}^{n} c_i x_i (a + b x_{n+1}) 

where y, x_1, ..., x_{n+1} are observed variables and a, b, c_1, ..., c_n are 


to be estimated (one of the c_i is put equal to unity). With the aid of the 
Edsac this problem has been attacked by an iterative method, that is, 
the parameters are adjusted so as to minimize the residual sum of 
squares directly without having recourse to the normal equations 
which result from the usual minimization procedure. The main feature 
of the method is that it is necessary to guess an initial approximation 
to the correct value and then adjust it upwards or downwards by suc- 
cessively decreasing intervals so that it converges to a value which 
minimizes the residual sum of squares. This procedure is practicable 
for the above equations since they can be transformed so that this 
process of adjustment is required for no more than two of the parame- 
ters, the remainder being estimated in the usual way. 
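A minimal sketch of such a search is given below. It is not the Edsac program. It assumes, for illustration, a model of the form y = a + b log(x1 + c x2), in which one scale coefficient is fixed at unity and only c has to be found by adjustment; for any trial value of c the parameters a and b follow from an ordinary linear regression. The function names, the starting guess, the step rule (halving) and the data are all assumptions.

```python
import math


def fit_linear(z, y):
    """Least squares of y on a constant and z; returns (a, b, rss)."""
    n = len(y)
    zbar = sum(z) / n
    ybar = sum(y) / n
    szz = sum((v - zbar) ** 2 for v in z)
    szy = sum((v - zbar) * (w - ybar) for v, w in zip(z, y))
    b = szy / szz
    a = ybar - b * zbar
    rss = sum((w - a - b * v) ** 2 for v, w in zip(z, y))
    return a, b, rss


def transformed(c, x1, x2):
    """The regressor log(x1 + c * x2) for a trial value of c."""
    return [math.log(u + c * v) for u, v in zip(x1, x2)]


def search(x1, x2, y, guess=1.0, step=0.5, tol=1e-8):
    """Adjust c up or down by successively smaller steps to minimize the RSS."""
    c = guess
    best = fit_linear(transformed(c, x1, x2), y)[2]
    while step > tol:
        moved = False
        for cand in (c - step, c + step):
            if cand <= 0:                 # keep the scale coefficient positive
                continue
            rss = fit_linear(transformed(cand, x1, x2), y)[2]
            if rss < best:
                c, best, moved = cand, rss, True
        if not moved:
            step /= 2                     # no improvement: shorten the interval
    a, b, _ = fit_linear(transformed(c, x1, x2), y)
    return c, a, b


# Made-up data generated exactly from a = 1, b = 2, c = 0.5.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.0 + 2.0 * math.log(u + 0.5 * v) for u, v in zip(x1, x2)]
c_hat, a_hat, b_hat = search(x1, x2, y)
```

A search of this kind finds a local minimum only; whether that is also the global minimum depends on the data and on the initial guess, which is why the text stresses the need for a reasonable first approximation.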


5.5 The Simultaneous Equations Approach to Econometric Models 


A potentially fruitful field of application for high-speed computers 
is the simultaneous equations approach to regression analysis11 which is 


10 See Houthakker [8] esp. Section 5.4.4. 
11 See Koopmans and others [10] Section 4. 
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the most recent development in the technique of econometrics. Of the 
two methods of estimation used, the “information preserving” or “full 
maximum likelihood” method is in practice too laborious for non-elec- 
tronic equipment, and even the “limited information” or “reduced 
form” approach12 requires much more computation than classical re- 
gression. No specific programs on this subject have been worked out, 
but the programs outlined above, especially that for the inversion of 
matrices, have been found useful in a number of applications. 


5.6 Other Statistical Applications 


Finally we mention two further fields of application. These are, first, 
the recent developments in input-output analysis, the central problem 
of which is the inversion of matrices of large order whose elements are 
non-negative, the diagonal elements being relatively heavy. So far the 
method adopted has been to invert the symmetric matrix A'A, where 
A is the (non-symmetric) matrix whose inverse is required, and then 
postmultiply the inverse by A'; thus 


(A'A)^{-1} A' = A^{-1} (A')^{-1} A' = A^{-1}. 
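The identity can be checked numerically with a short sketch. The helper names are assumptions, the 2 × 2 matrix is invented, and a plain Gauss-Jordan elimination stands in for the symmetric inversion routine; it needs no pivoting here because A'A is symmetric and positive-definite whenever A is non-singular.

```python
def transpose(m):
    """Return the transpose of a matrix held as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def invert_spd(s):
    """Invert a symmetric positive-definite matrix by Gauss-Jordan elimination."""
    k = len(s)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(s)]
    for i in range(k):
        pivot = aug[i][i]
        aug[i] = [v / pivot for v in aug[i]]
        for r in range(k):
            if r != i:
                f = aug[r][i]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[i])]
    return [row[k:] for row in aug]


def invert_general(a):
    """Inverse of a non-symmetric matrix A formed as (A'A)^{-1} A'."""
    at = transpose(a)
    return matmul(invert_spd(matmul(at, a)), at)


a = [[1.0, 0.2],
     [0.3, 1.0]]              # made-up non-symmetric matrix, heavy diagonal
product = matmul(a, invert_general(a))   # should be close to the unit matrix
```

One design consideration, offered as a general remark and not as a statement about the Edsac work: forming A'A squares the condition number of the problem, so some accuracy is traded for the convenience of reusing a symmetric routine.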


The second field is that known as the Monte-Carlo method which has 
received considerable attention in the United States. In this, numbers 
generated by random sampling from an appropriate probability distri- 
bution are used to evaluate a function from which a solution may be 
obtained which converges to the true solution. Reference should be 
made to [12] for an account of this method. 
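As a small illustration of the principle, and not of any program mentioned in the paper, the sketch below estimates the integral of u² over the unit interval (true value 1/3) by averaging the function at uniformly distributed random points. The function name, the seed and the sample size are arbitrary choices.

```python
import random


def monte_carlo_mean(f, n, rng):
    """Average of f over n points drawn uniformly from the interval [0, 1)."""
    total = 0.0
    for _ in range(n):
        total += f(rng.random())
    return total / n


rng = random.Random(1953)     # fixed seed so that the run is repeatable
estimate = monte_carlo_mean(lambda u: u * u, 100_000, rng)
```

The sampling error of such an estimate falls off in proportion to the square root of the number of draws, so each extra decimal place of accuracy costs roughly a hundredfold increase in computation.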


6. SUMMARY 


In Section 5 of this paper we have given some examples of the way 
in which most of the arduous and mechanical portions of a piece of 
econometric analysis can, with the aid of a relatively small number of 
programs, be reduced to purely automatic processes suitable for elec- 
tronic computation. By these means the time interval between the 
conception of a hypothesis and its testing against observational data 
can be substantially reduced, and the research worker can devote a 
greater proportion of his time to the problems of interpretation and 
formulation of concepts. Thus in econometrics it is frequently diffi- 


12 See Anderson & Rubin [1]. 
13 See M. R. Fisher [4]. 
14 For an alternative method of inverting these matrices which is also well suited for electronic 
computation, see Waugh [13] or more generally the paper by Leontief [11]. 


cult to decide in advance which variables should be included in a regression 
analysis, or if the nature of the variables is known what lags should be 
introduced. If an electronic computer is available, various specifica- 
tions can be tried out and a selection made on the basis of the results. 
Further, the restriction of linear regression equations can be overcome 
without too much difficulty. 

In spite of the superiority in speed and flexibility of the large elec- 
tronic machines over conventional punched card machines, it seems at 
present that for the storage and summarization of a large amount of 
data such as are obtained from budgetary surveys, the latter machines 
may still be preferred. In such a case the existence of a punched card 
to paper tape converter will minimize the difficulty of transferring the 
summarized information to the high-speed machine for further analysis. 

At present the use of high-speed computers in statistics is still re- 
stricted by the small number of available machines and the novelty of 
the techniques necessary to operate them. In these circumstances the 
statistician who is fortunate to gain access to a machine will prefer to 
devote most of the time he can spare for programming to the con- 
struction of programs and sub-routines with a wide validity. The for- 
mulation of statistical problems in terms of operations in matrix algebra 
is particularly helpful since modern high-speed machines are admirably 
suited to the multiplication and inversion of matrices of moderate size. 

We have, finally, to express our thanks to Dr. M. V. Wilkes of the 
Cambridge University Mathematical Laboratory for granting us ac- 
cess to the Edsac and to the many members of the Staff of the Labora- 
tory for their help and cooperation. 
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THE ELEMENTS OF AN INDUSTRIAL 
CLASSIFICATION POLICY* 


Walt R. Simmons 
U. S. Bureau of Labor Statistics 


In a recent publication, the Bureau of Labor Statistics reported that 
242,000 workers were employed in the General Industrial Machinery 
industry in the United States; that employment in the industry, down 
one percent over the previous month, had increased six percent over the 
same month a year ago. In another release the Census Bureau shows 
retail sales of 817 million dollars for one month in the Apparel indus- 
try. That release states further: “Among apparel stores, which as a 
group showed no change in June 1952 compared with June 1951, 
women’s ready-to-wear stores showed sales up 3 percent, while men’s 
and boy’s clothing stores showed a decrease of 3 percent.” Similar 
statements appear daily in the publications of statistical agencies. In 
the literal sense, “What is the meaning of these statistics about indus- 
trial and commercial activities?” 

A direct answer to the question is that these statistics are the end 
product of particular surveys conducted under particular sets of con- 
ditions and particular procedures. That answer is precise and correct. 
It carries, however, some of the flavor of the remark of the villager who, 
when asked why the railway depot was so far from the town square re- 
plied, “Because that’s where the trains always stop.” 

Now a statistical survey is subject to many hazards. Dr. Deming 
has listed 19 sources of error in one compilation. Others have added 
to that list. The last decade has seen real progress in identifying, iso- 
lating, and measuring the effect of such hazards as error and variability 
of reply to questionnaire, non-response bias, processing error, and sam- 
pling variability. Every step forward in this direction enriches the an- 
swer to the question, “What is the meaning of the data?” 

Some of us who spend much of our time analyzing statistics from the 
methodological and the procedural side are of the opinion that one of 
the most significant sources of potential error or ambiguity, especially 
in data reported by business establishments, is to be found in the in- 
dustrial classification of those data. It is not my intention to argue 
that classification is necessarily the greatest of all survey hazards—in- 
deed this is not the occasion for ranking the components of survey er- 


* A paper presented as a part of the program on “The Meaning of Statistics Classified by Industry,” 
at the annual meeting of the American Statistical Association, Chicago, Illinois, December 29, 1952. 

ror. But before undertaking a discussion of classification policy and 
its impact on statistics, I should like to offer some evidence which testi- 
fies to the high respect to which industrial classification is entitled as a 
builder of statistics, both good and bad. 

1. Consider the statistic to which I referred a moment ago concern- 
ing employment in the General Industrial Machinery industry. If the 
definition of that industry had included agricultural machinery—a not 
unreasonable possibility—employment for the industry would have 
been 427,000 instead of 242,000; a 76 percent larger figure. Incidentally, 
the change over the year would have been one percent rather than the 
six percent which was noted. Similar situations are common for statis- 
tics on production, sales, or other items. 

2. Suppose one plant in the machinery industries employed approxi- 
mately 25,000 workers. With the present development and widespread 
knowledge of what is meant by the term “number of workers em- 
ployed,” it is most unlikely that any reporting or processing error in- 
volving these data would exceed ten percent of the true total, and much 
more likely that any deviation from the true figure would be less than 
1 percent of the plant total or less than one tenth of one percent of the 
grand total for the General Machinery Industry. On the other hand, the 
decision to classify this plant into the industry, or to classify it into 
another affects the General Industrial Machinery totals by a full 10 
percent. 

3. More than a year ago, an Interagency Committee, under the 
chairmanship of the Federal Bureau of the Budget, was formed for the 
purpose of analyzing and coordinating the several employment figures 
which then existed. A very considerable number of man-hours has been 
devoted to that task—and I am pleased to say that real progress has 
been made. No cost records of this reconciliation task are available, but 
it is certainly being on the conservative side to state that it has been 
necessary to devote more than 90 percent of the total effort to differ- 
ences arising from industrial classification policy and practice. 
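The arithmetic behind the first two illustrations can be set out in a few lines. The figures are those quoted in the text; the variable and function names are invented for this sketch.

```python
def percent_change(new, old):
    """Change from old to new, expressed as a percentage of old."""
    return 100.0 * (new - old) / old


reported = 242_000            # General Industrial Machinery employment
with_agricultural = 427_000   # the same industry if agricultural machinery is included
plant = 25_000                # the single large plant of the second illustration

# Widening the definition raises the figure by about 76 percent.
widening = percent_change(with_agricultural, reported)

# Moving the one plant in or out shifts the industry total by about 10 percent.
plant_share = 100.0 * plant / reported

# A 1 percent reporting error for that plant is about 0.1 percent of the total.
reporting_error_share = 100.0 * (0.01 * plant) / reported
```

On these figures the classification decision for a single plant moves the industry total roughly a hundred times as far as a 1 percent reporting error for the same plant.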

Without further belaboring this matter of the transcending influence 
of industrial classification on establishment statistics, I shall proceed 
with an analysis of classification policy. 


Purpose and Objective of Industry Classification 

Industry Classification has two fundamental objectives and perhaps 
five essential supporting specifications. 

A Tool for the Management of Data. The primary purpose of Indus- 
trial Classification is to provide a system for organizing data into under- 
standable, manageable blocks of information. It is a procedure for sim- 
plifying the collection, processing, analysis, and presentation of data. 
At every step of a survey, beginning with the formulation of goal, and 
extending through the assembly of a mailing list, the design of ques- 
tionnaire and of sample if there be one, the collection, verification, and 
tabulation of data, and reaching finally the analysis of findings and 
their publication, we find that classification of materials into mean- 
ingful, convenient industrial categories is an all but absolutely neces- 
sary aid. 

The Class Definition. In a somewhat different sense, an equally im- 
portant purpose of industrial classification is to establish the boundary 
of the category to which given statistics on industrial activities relate, 
and to define the content of that category. In its simplest terms, the 
classification must say for every recognized category, what is included, 
what excluded. Recall the statistics mentioned earlier on sales by Ap- 
parel Stores. The classification must define an Apparel Store. It must 
answer such questions as, Are the Clothing Sections of large depart- 
ment stores included? Are shoes apparel? If yes, does the answer extend 
to shoe repair shops? How about costume jewelry? Does the term in- 
clude wholesale outlets? Does it encompass second-hand stores? Does 
it include importers? Is a tailor shop an apparel store? Does the cate- 
gory include all or a part of the store which sells both dry goods and 
apparel, or groceries and apparel? How are the Naval Clothing Factory 
and Naval Small Stores treated? We could continue for a good while. 
The definitional job is not endless, but neither is it a light task. 

These are the prime purposes of industrial classification: To provide 
a tool for the management of data, and to define the category to which 
the statistics about which we may be talking relate. The discipline fails 
its objective, however, unless it satisfies five additional specifications. 
The first of these already has been implied: it is that the categories 
which are defined must be meaningful in the judgment of users of the 
data. Fulfillment of this condition obviously is not unique, in view of 
the widely different specific desires of different users, and the subjective 
nature of the concept. Nevertheless, it is clear that some formulations 
of classes would result in nonsense categories, while others would con- 
form to the opinions of large numbers of persons as to what categories 
are appropriate. 

The second specification is that the classes must be formed in such a 
manner that basic data for those classes either are readily available or 
can be made so at tolerable cost. A corollary of this proposition says 
that the distinctions which separate one class from another must be 
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readily understandable and capable of easy application. Although avail- 
ability is relative, availability in terms of cost is a real thing and can be 
weighed. For data reported by industrial and commercial businesses, 
availability usually, although not always, means existence in written 
records of the company. Often it means existence in a particular type 
of record. For example, it would be desirable to classify all activity of 
State and local governments into each of the industrial activities which 
are recognized for private enterprise. The day may come when our 
society considers this action worth the cost. But today the records nec- 
essary for reporting such information simply do not exist. The necessity 
for existence of information in a particular type of record is illustrated 
in the printing industry. It would be desirable to have statistics on in- 
ventory, capital expenditure, and earnings separately for all job print- 
ing and for newspapers. In the United States there are many plants 
which are combined newspapers and job shops. The great majority of 
these have sales records which distinguish between the job work and 
newspaper; many have cost records which permit computation of price 
estimates for customers; but very few can separate the two activities 
in their payroll records, nor in their stock-on-hand or capital equipment 
accounts. Therefore it is realistic to recognize combined newspapers 
and job shops as a single indivisible industrial category. 

A third essential specification for an industrial classification and pol- 
icy is that they foster continuity in statistics over time. The classifica- 
tion must have at least two characteristics in order to accomplish this 
function. It should seek to identify classes of business establishments 
which are relatively stable, and it should change the definition of a 
class only when there are compelling reasons for doing so. I would sug- 
gest that a business establishment is stable with respect to classifica- 
tion if normally it remains in the same classification for at least a 12- 
month period. For example, it may be appropriate to distinguish be- 
tween factories which manufacture footwear, and those which make 
luggage. It would be unwise to attempt to distinguish between fac- 
tories which make shoes with leather heels and soles and those which 
make shoes with rubber heels or soles, since the same factory typically 
does both on different days, or even on the same day. Clearly a statis- 
tical series suffers a break in continuity of greater or lesser severity 
each time the class it measures is redefined. Consideration at this point 
and at many others in the field of practical classification must be given 
to the decisions of yesterday as well as the evidence of today. 

Good industrial classification must also promote comparability of 
statistics. Comparability is a blanketing concept which I do not pro- 
pose to treat in detail in this paper. Let me merely exhibit some of its 
facets so that we may have a feeling for this aspect of classification 
policy. Comparability requires that the collecting agencies, the re- 
porting companies, and statisticians and users in general interpret and 
execute classification in a common way. It requires definitions and prac- 
tices which apply alike to all the principal statistics which are classi- 
fied—to such items as employment, wages, hours of work, production, 
sales, inventory, capital expenditure, capacity, claims for benefits, ma- 
terials used, placement of workers on jobs, and taxes paid. Comparabil- 
ity also connotes balance, in the sense that both in the structure of a 
classification and in its use, there should exist a tendency to give simi- 
lar attention to activities of similar importance. It would, for example, 
be inappropriate to build a code which gave equal importance in the 
United States to the manufacture of transportation equipment and the 
manufacture of umbrellas. This of course does not mean that the manu- 
facture of umbrellas might not be recognized as a sub-class. 

Finally, the classification must be exhaustive in the sense that every 
activity encountered in the economic world must be classifiable into 
one or another of the defined categories. Preferably all miscellaneous 
or “not elsewhere classified” groups should be kept small. 

We have looked at the over-all purpose and objective of a classifica- 
tion and policy, and at the leading additional specifications which must 
be met. I’d like now to come to closer grips with a specific problem: to 
identify the components of a classification policy, to determine what 
decisions must be made in order that a classification policy shall exist. 
The advantage of this course must be nearly self-evident. A problem 
can be solved only when the problem itself is understood. In a classifi- 
cation program there is great danger that day-to-day horseback rulings 
made without reference to a rulebook and related principles would lead 
to confusion, ambiguity, and inconsistency in statistics. It is to this 
identification and formulation process that I have given the title, “The 
elements of a classification policy.” 


THE ELEMENTS OF AN INDUSTRIAL CLASSIFICATION POLICY 


1. The Formal System. The largest single component of policy is the
List of Categories which constitute the classification. The List should
include not only the titles of each recognized category, but definitions
of those categories, a statement of the principles upon which the classes
were formed, and a coding system which permits easy identification of
categories and to some degree relates individual categories to one another
and to the whole industrial economy. I shall say little regarding
the List for two reasons: (a) It is such a very extensive topic that I
could not do it justice in the space that is available, and (b) This part 
of the subject has received much more attention over the past two 
decades than some of the other elements, and, I think, has been com- 
petently handled. In fact this Association only last year elected to fel- 
lowship, V. S. Kolesnikoff of the U. S. Bureau of the Budget for excel- 
lence of work on industrial classification standards. I do not wish to 
pass this critical component, however, without taking note of several 
cardinal points. The first is that The Standard Industrial Classification 
Manual, developed through interagency committees of the Federal
government in cooperation with trade associations, labor unions, research
groups, and other organizations, is a List which is widely used
by the Federal agencies and to an increasing extent by other bodies. I
should like to urge all persons who are able to do so to extend its use,
and to take an active part in bringing about further improvements in
the SIC manual. The Economic and Social Council of the United Nations
has adopted a Standard Industrial Classification of all Economic
Activities, which is similar to the SIC, and has recommended its use to
all member nations.

Both the UN classification and the SIC, it should be made clear, are
classifications by industries, and not by occupations, or by commodities.

Another difficult-to-apply, but basic and pervasive principle of the 
SIC is that the classification must conform to the existing structure of 
American industry. 

As we move to consideration of the other elements of classification
policy we shall note that the List and the other elements are not entirely
independent of one another; there is interaction among the elements.

2. The Mode of Classification. The second element of policy is intimately
related to the List but is worthy of special note because it bears
so forcefully on still other elements. This is the choice of mode of classification;
i.e., selection of the leading concept which is to guide us in
characterizing a group of activities as an industry. The dominant view
is that the primary objective should be to create classes which tend to
be homogeneous in their response to economic stimuli, and that the
product or service which is brought into the market is generally best for
this purpose. The name “nature of business activity” is given to this
concept. Other choices could have been made; for example, the primary
measurement might be in terms of materials used, type of capital investment,
nature of ownership or corporate structure, size of organization,
technology, or work-force requirements. Each of these influences
does have an impact on the industrial structure of American industry.
Purposes can be found for which each is better suited as a yardstick
than any of the others. All have in fact a role in the SIC. But it is “nature
of business activity”; i.e., product or service brought into the market
that gets first consideration.

3. The Unit to be Classified. The third element of policy is perhaps 
the most difficult to resolve. For what unit shall data be reported? We 
seek resolution through a blend of the purpose to which the data will 
be put, and the specification that required information of respondents 
must be available. At least four concepts deserve consideration. The 
broadest is a unit outlined by the span of financial control. Although
such a unit has relationship to economic power, it is very difficult to
identify in practice and is too heterogeneous for most purposes. 

The enterprise, or legal entity—corporation, partnership, individual 
doing business as such, or a cooperative association—is a good choice 
from several points of view. It is useful in financial matters, it is related 
to economic power, it is perhaps the least ambiguous unit in the sense 
that it is determinable in practice. But companies cross many industry 
lines, and also State lines, thereby being subject to different laws, regulations
and influences. They also are too heterogeneous for most purposes.
It seems we must look for smaller units which are engaged in relatively
specific, preferably single activities. This notion suggests that the
unit might be the Department which is engaged either in direct produc- 
tion of a commodity or service or possibly in an ancillary activity such 
as the power plant of a factory. With this choice we secure a unit which 
has a relatively high degree of homogeneity with respect to nature of 
business activity and among such units there probably tends to be
homogeneity with respect to many of the statistics in which users are 
most interested. Unfortunately, we face new difficulties along this road. 
The boundaries of a Department are not always easy to locate inas- 
much as companies are organized in a variety of ways. There are both 
theoretical and operational difficulties in distinguishing between direct
and ancillary activities. Finally, and conclusively, in very large numbers
of situations, the desired statistics simply are not recorded or maintained
on a Departmental basis and cannot be reported in that manner.

The most suitable unit seems to be the smallest unit for which it is
possible to provide all the information normally sought in statistical
surveys. It appears further that this unit must lie between the company
and the department. The unit which most of us accept is the “establishment.”
An establishment is usually defined as a single physical
location where business is conducted or where services or industrial
operations are performed; for example, a factory, mill, store, mine, or
farm. Under certain circumstances, if the single location is comprised
of two or more Departments for which separate payroll and inventory
records are maintained and which are engaged in separate and distinct
activities, each Department may be considered an establishment.

4. Multi-activity Locations. If there were complete agreement on just 
what constitutes an identifiable “nature of business activity,” and if at
each location only one such activity were performed, our task of clas- 
sification would be an easier one. We should not then find it necessary 
to explore the question of how records were kept, for the location would 
be a single establishment. But there are locations at which more than 
one product or service are brought into being. Three situations arise. 
In the first, there is agreement that the products or services represent 
more than one activity for which separate industrial categories are de- 
sirable, and these activities are such that the necessary records are 
maintained separately for them. In this case, the action is clear: sepa- 
rate categories are established in the List, and each activity (Depart- 
ment) is treated as an establishment. The second case is identical with 
the first, except that the records, while not initially available, can 
through suitable action be created. The third case is the one in which 
either it seems undesirable to separate the activities, or if that be de- 
sirable, it is too costly to produce the necessary records. 

For this third case another question must be answered. What further 
measurement should be used to classify this multi-activity establish- 
ment into a single category? There is little quarrel with the rule, “Clas- 
sify the establishment according to its principal activity, disregarding 
for this purpose all other activities.” There is not unanimity of opinion,
though, on the proper procedure for determining “principal activity.”
Without discussing the pros and cons of several possible alternatives, 
let me say that my preference, following the notion of response to eco- 
nomic stimuli, mentioned earlier, is to weigh the different activities by 
the amount of income produced. That which is greatest by this meas- 
ure is the principal activity. In operations, because value of sales is 
usually a good approximation for income produced (for products or
services at the same stage of production), and because sales figures are
usually available whereas amount of income produced is not, I would 
use value of sales in selecting the principal activity. 
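
The sales-based rule just described can be illustrated with a short Python sketch. It is an illustration under stated assumptions only: the function name `principal_activity`, the dictionary layout, and the sample figures are hypothetical and are not taken from any agency's coding manual.

```python
def principal_activity(sales_by_activity):
    """Return the activity with the largest value of sales.

    sales_by_activity maps an activity name to its value of sales,
    used here as a stand-in for income produced.
    """
    if not sales_by_activity:
        raise ValueError("at least one activity is required")
    # max() keeps the first entry on an exact tie; a working rulebook
    # would need an explicit tie-breaking rule (assumption, not source).
    return max(sales_by_activity, key=sales_by_activity.get)


# Hypothetical establishment with two activities at one location.
plant = {"wooden furniture": 70_000, "metal furniture": 30_000}
print(principal_activity(plant))  # wooden furniture
```

The whole establishment, with all of its sales, would then be coded to the single industry returned.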

5. Length of Time Interval. After the unit, mode, and manner of
classification have been agreed upon, the next question faced is the
length of time interval on which the classification of an establishment
should be based. This determination might be given by the answer to
the question, “What's the establishment doing now?” Or it might be
made on the basis of activity for a week, month, year, or other period.
For the majority of establishments, the answer will be the same for any
interval up to a year. But for some it will not. Since there are establishments
which change activity at very frequent intervals, we would
not choose a time interval so short that the classification of these firms
was highly unstable. In fact, we have set stability of classification as a
desirable feature in accepting the specifications of continuity and com- 
parability. These considerations, plus the fact that seasonality has an 
annual cycle, strongly suggest that classification be based on a 12- 
month period of activity. 

6. Time Lag or Lead. Closely associated with the length of period
on which classification is based is the relationship between that period
of reference for code determination and the period to which the data
collected and so coded refer. This is largely an operational administrative
problem, but it is also a policy matter. For a historical survey of
the type of the quinquennial censuses, the answer is fairly clear: these
surveys normally cover a one-year period, all in the past; data for the
one-year period are classified according to nature of business for that
same 12-month period. For a current survey of the type of the BLS
Monthly Employment Statistics series the proper solution is not so
immediately apparent. Consider the situation, say, in March 1953. In
a monthly series, the estimates will be classified and published from
one to two years before the Census-type codes for the year 1953 will
become known. Several courses are possible, some involving prediction
of activity in the future. The most common practice is perhaps to
use activity in the previous year as the basis for classification. This difference
in timing is one of the most troublesome features in securing
and maintaining comparability among different sets of data. No thor- 
oughly satisfactory solution is known to me. 

7. Frequency of Review. Another dimension of the timing problem,
or perhaps it is only another way of looking at the lead or lag characteristic,
is the frequency with which the classification of an establishment
should be reviewed, and changed if the activity of the establishment
has changed. With this question let us look simultaneously at
still another element which interacts with the timing element.

8. The Effect of Previous Classification Upon Current Classification.
Should the current classification of an establishment be independent
of a previous classification? This question brings us face to face with
perhaps the most vexing and controversial sector of the entire subject.
Consider this example: In 1950, sales of the West Dakota Corporation
(a single establishment) came 55 per cent from manufacture of electric
motors and 45 per cent from aircraft parts. Let's say total sales were
$100,000. We review the sales again the next year and find the total
unchanged, but for 1951, activities are reversed in volume and now it
is aircraft parts that account for 55 per cent of the total. Using the
principles we have established and data for 1950, the West Dakota
plant and all its $100,000 sales are classified for 1950 into the electric
motors industry. If classification is based on 1951 data, and is independent
of the earlier coding, statistics for 1951 will show all of the
plant in the aircraft parts industry. If we are interested exclusively in
estimates of level, the best decision we can make is that just implied:
credit motors with $100,000 of activity in 1950, and none in 1951;
credit aircraft parts with none in 1950, and $100,000 in 1951. This is
the course advocated by exponents of “current” classification, although
some advocates of the procedure would review the coding at more frequent
than annual intervals.

In reality the output of this establishment has contributed $55,000
to motors in 1950 and $45,000 in 1951; and has contributed $45,000
to the aircraft industry in 1950 and $55,000 in 1951. Current classification,
as just defined, certainly does violence to statistics on trend, and
to our concepts of comparability and continuity. Continuity can be
maintained and trend reflected in a more nearly accurate manner if the
1950 classification of the establishment is retained in 1951. The practice
of keeping the coding of an establishment unchanged from one
period of time to another constitutes a policy of fixed classification. It
too has weaknesses. Even if an establishment changes its activity
slowly, the initial classification may become entirely unrealistic after
a long enough interval. If the establishment changes rapidly, or discontinues
one activity and enters another, the fixed classification becomes
misleading in a short while.
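
The arithmetic of the West Dakota example can be laid out in a few lines of Python. The sketch is illustrative only: the function names and data layout are assumptions, while the figures are those quoted in the example.

```python
# Sales of the single establishment, by year and activity (from the example).
sales = {
    1950: {"electric motors": 55_000, "aircraft parts": 45_000},
    1951: {"electric motors": 45_000, "aircraft parts": 55_000},
}


def principal(activities):
    """Activity with the largest value of sales."""
    return max(activities, key=activities.get)


def credited(sales, policy):
    """Sales credited to each industry when the whole establishment
    is assigned to one code per year.

    policy "current": recode each year from that year's sales.
    policy "fixed":   keep the code assigned in the first year.
    """
    years = sorted(sales)
    first_code = principal(sales[years[0]])
    result = {}
    for year in years:
        code = principal(sales[year]) if policy == "current" else first_code
        result[year] = {code: sum(sales[year].values())}
    return result


for policy in ("current", "fixed"):
    print(policy, credited(sales, policy))
```

Under the current policy motors drop from $100,000 to nothing, although actual motor output fell only from $55,000 to $45,000; under the fixed policy motors are credited with $100,000 in both years.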

Is there a way of reconciling these policies, some method which approaches
current classification in producing accurate levels in statistics,
but still retains to a degree the advantages of fixed classification
in maintaining comparability and continuity? The answer is yes. There
are several methods. All, perhaps, can be termed current classification
schemes modified by a resistance or reluctance principle. The essential
feature of these schemes is that the current classification replaces the
previous classification, provided the activity pattern has shifted by an
amount in excess of some standard tolerance; otherwise the previous
classification remains fixed. With suitable side conditions, certain optimum
determinations of tolerance can be made. A good many persons
and agencies over a number of years have employed some form of reluctance
in coding, but the first formal treatment of the concept known
to me is in an unpublished memorandum written by Jack L. Ogus of the
U. S. Census Bureau.

A classification policy must then include decisions on whether to use
fixed or current classification, or some specific resistance technique, and 
on the frequency of review of activity. 
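
One possible form of a resistance rule is sketched below in Python. The text does not specify how the shift in activity pattern is to be measured, so the test used here (the leading activity's share of sales must exceed the previous code's share by more than a tolerance) and the name `reluctant_code` are assumptions made purely for illustration; they are not the rule of the Ogus memorandum.

```python
def reluctant_code(previous_code, sales_by_activity, tolerance):
    """Current classification modified by a resistance rule.

    The previous code is replaced only when the leading activity's share
    of sales exceeds the previous code's share by more than `tolerance`
    (a fraction of total sales). This particular test is an assumption.
    """
    total = sum(sales_by_activity.values())
    if total <= 0:
        raise ValueError("total sales must be positive")
    current_code = max(sales_by_activity, key=sales_by_activity.get)
    if previous_code is None or current_code == previous_code:
        return current_code
    lead = (sales_by_activity[current_code]
            - sales_by_activity.get(previous_code, 0)) / total
    return current_code if lead > tolerance else previous_code


sales_1951 = {"electric motors": 45_000, "aircraft parts": 55_000}
# A 10-point lead is inside a 20-point tolerance, so the 1950 code stays.
print(reluctant_code("electric motors", sales_1951, tolerance=0.20))
# A decisive shift overrides the previous code.
print(reluctant_code("electric motors",
                     {"electric motors": 10_000, "aircraft parts": 90_000},
                     tolerance=0.20))
```

A tolerance of zero reduces this to current classification, and a tolerance of one or more to fixed classification, so the two pure policies appear as the limiting cases.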


* * *


My discussion has recognized eight major components or elements
of an industrial classification policy. Firm decisions on each of these 
elements are essential to a sound policy. Fortunately, when they have 
been made, one has gone a long way through the planning stages of a 
successful program. It should be added that the program will be more 
cohesive and will run more smoothly if these basic decisions are aug- 
mented by a set of written working rules which cover what might be 
termed pseudo-policy matters, many of which may be peculiar to the 
particular program. I shall make no attempt to enumerate these mat- 
ters, but will illustrate with a few examples: 

(a) Rules for classifying workers and activities which are not localized
geographically.

(b) Precise instructions for distinguishing between ancillary or auxiliary
activities which are included with the parent establishment,
and those which are treated as separate establishments; e.g. accounting
offices are always included with the establishment
which they serve.

(c) Definite procedure for adjusting for such coding errors as may be
discovered.

(d) Mechanical arrangements such as assignment of an identifying 
number (not name) to each establishment. 

(e) What types of interplant transfers should be considered sales? 


* * *


In conclusion, I should like to stress these points: (1) Industrial
classification is a many-sided methodology; explicit decisions and action
must, and can, be taken with respect to its major elements. (2)
Industrial classification is an approximating technique; it does not
always give the fineness of detail that we might prefer, but because it
chooses as building blocks the establishments, for which many records
exist, it yields a wealth of information that perhaps no other selector
can match. And finally, (3) the importance of the subject to economic
statistics is difficult to overstate.


EXPERIMENTAL DESIGNS AND PROBABILITY
SAMPLING IN MARKETING RESEARCH*

GENERAL CONSIDERATIONS IN MARKETING RESEARCH


IN GENERAL, the problems of marketing research center around the
companion objectives of market development and physical operating
efficiency. Much of the market development for a particular product
depends on ability to determine the economic wants of both actual
and potential consumers. The marketing system operates in an imper- 
fect way in bringing about practices and services most acceptable. In 
the large, the system is so constructed that products are offered to the 
public on a “take-it-or-leave-it” basis with adjustments made by expe- 
rience in a slow and cumbersome way. A study of these imperfections 
in the system constitutes the most important problems of marketing re- 
search. 

A consumer’s decision to buy or not to buy is based on a multitude of
motivations varying all the way from fickle whims to thorough study
of value received per dollar spent. Small wonder then that the crude,
unscientific observations of producers and merchants lead to uneconomic
marketing practices which fail to satisfy the consumer and cost
the producer and merchant vast sums in lost sales. The problem re- 
solves itself to one of measuring variables believed to be associated with 
volume of consumer purchases. 


There are two distinct and conventional avenues of attack on such
problems:


(i) The problems may be studied under controlled or laboratory conditions 
using experimental designs. 


(ii) The problems may be studied under uncontrolled or actual conditions 
using sample surveys. 


Using the experimental method the researcher must describe and con- 
trol the conditions under which the effects are produced. Variables not 
kept constant must be measured and eliminated statistically. The data 
gathered with the survey method are the everyday experiences of the 
populations under study. Elimination of the effect of non-test variables
is attempted by stratification in sampling and by statistical analysis
after the data are gathered. Assuming that this can be done, the latter
approach is restricted in that innovation cannot be tested. This is a
serious restriction, for market development per se implies innovation.

* Presented at the American Statistical Association Meetings in Chicago, December 27, 1952.

Any satisfactory method must meet two major requirements if re- 
sults are to have utility; these are: 
(i) The method must permit relatively satisfactory means of isolating the 
effects of specific variables. 


(ii) The effects of specified variables must be measured under conditions es- 
sentially the same as those found under actual conditions. 


Once these requirements are met the selection of procedure is largely 
one of cost consideration per unit of information. 

Of paramount importance to the successful solution of a marketing 
research problem is a thorough understanding of the principles in- 
volved and of the nature of the tools employed whether they be sta- 
tistical or otherwise. Thus, it may be necessary to employ a team of 
scientists to effect practical solutions. The statistician advisedly may 
be a member of such a team. 

Certainly the researcher must keep in mind that solutions are noth- 
ing more than a stage in development. In this sense solutions to mar- 
keting problems are sought only in terms of improvement over existing 
practices. The theoretical potential of market development is always
beyond grasp, with the area between present practice and theoretical
perfection always offering a fertile field for research.


SOME EXPERIMENTS IN MARKETING RESEARCH 


The remainder of the discussion will be devoted to some illustrations 
of research designed to measure consumer wants for one product, apples.
The coordinated sequence of projects to be described was undertaken
at the request of apple growers who at the outset were of the
opinion that quality of product was one of the most serious factors impeding
apple sales. After considerable deliberation it became apparent
that the industry was more concerned with bruising than any other 
quality problem. 


Studies on Bruising 
In 1948 and 1949 Van Waes undertook to determine the effect of
bruising on consumer acceptance.¹ Since it was assumed that different
degrees of bruised apples were in the market place the survey method
theoretically would have offered a satisfactory tool. However, previous
experience in attempting to isolate the effect of one particular variable
from a multitude of others through either stratification or analysis led
to the use of controlled experiments in which non-test variables could
be held constant.

¹ Van Waes, D. A., Economic Significance of Bruising on Retail Sales of McIntosh Apples, Ph.D.
Thesis, Cornell University Library, Ithaca, N. Y., 1951.

For such a test the self-service supermarket appeared to be made to
order, for in such a store the reactions of customers could be observed
and measured. It would have been a relatively simple matter to run a
series of tests in which matched lots of apples varying only as to degree
of bruising were offered to buyers, but in order to be able to forecast
sales the tests must be conducted in an environment simulating
actual marketing circumstances. It was not the general practice for
stores to offer several lots of apples varying only as to bruising. Relative
sales from the various lots would not indicate what actual sales
would be if only one of the lots were offered. Many such experiments
using matched lots have been conducted in the past but the results are
meaningless in predicting actual sales of either one or the other lot.²

In order to simulate actual conditions it was necessary to have only
one degree of bruising in a store at any one time, and in order to obtain
valid comparisons among the various degrees of bruising it was necessary
that they be tested under comparable conditions. Since time and
store differences represented two major sources of variation, a design
with two-way elimination of variation was desirable. The latin square
design was admirably suited for this situation. In this design every
treatment (the various degrees of bruising) appeared once in a row
(the particular time interval selected) and once in a column (the store).
The latin square design was found to be very effective in marketing
research for controlling or measuring variations due to store and time
differences.⁴ Therefore, in order to study the effect of bruising on the
volume of apple sales four degrees of bruising were set up with the lots
of apples alike in all other respects. These four treatments were tested
in three 4×4 latin squares. The columns of the three sets of 4×4 latin
squares were the 12 stores (one in each of 12 cities) in which the experiment
was conducted. The rows were four two-week periods.
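
A layout of this kind, three 4×4 latin squares with 12 stores as columns and four periods as rows, can be sketched in Python as follows. This is a minimal illustration, not the randomization procedure actually used in the study: the labels A to D stand in for the four degrees of bruising, and the function names and the seed are hypothetical.

```python
import random


def latin_square(treatments, rng):
    """Return an n x n latin square as a list of rows.

    Starts from a cyclic square and shuffles rows, columns and labels.
    This randomizes the layout but does not sample uniformly from all
    latin squares of that order.
    """
    n = len(treatments)
    rows, cols, labels = list(range(n)), list(range(n)), list(treatments)
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(labels)
    return [[labels[(r + c) % n] for c in cols] for r in rows]


def is_latin(square):
    """True if every symbol occurs once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({row[c] for row in square} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok


def store_period_plan(treatments, n_squares, seed=0):
    """Map (period, store) -> treatment; each square covers its own stores."""
    rng = random.Random(seed)
    n = len(treatments)
    plan = {}
    for s in range(n_squares):
        square = latin_square(treatments, rng)
        assert is_latin(square)
        for period, row in enumerate(square, start=1):
            for col, treatment in enumerate(row):
                plan[(period, s * n + col + 1)] = treatment
    return plan


plan = store_period_plan(["A", "B", "C", "D"], n_squares=3)
for period in range(1, 5):
    print(period, " ".join(plan[(period, store)] for store in range(1, 13)))
```

Each store receives every treatment exactly once over the four periods, and within each square every treatment is on offer in exactly one store during any period, which is what allows store and period differences to be separated from treatment effects.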

As a companion study to the one described above a survey was made
of randomly selected stores to determine the extent of bruising on apples
normally on the market. The sample was drawn in the same cities
included in the controlled experiment. This was done in order that recommendations
could be made from the results of the controlled experiment
in terms of the extent of bruising on apples actually in the market
place.

² Van Waes, D. A., “Evaluation of Research Techniques Used for Measuring the Influences of
... Associated with Volume of Consumer Purchases in Retail Stores,” Methods of
Research in Marketing, Paper No. 1, Department of Agricultural Economics, Cornell University, July

³ Fisher, R. A., The Design of Experiments, 5th Edition, Hafner, New York, 1949.

⁴ Dominick, Jr., B. A., “An Illustration of the Use of the Latin Square in Measuring the Effectiveness
of Retail Merchandising Practices,” Methods of Research in Marketing, Paper No. 2, Department
of Agricultural Economics, Cornell University, June 1952.

The companion studies furnished two important items of informa- 
tion: 

(i) The extent of bruising necessary to reduce the volume of apple sales. 

(ii) The extent of bruising on apples found in the market place. 


With the above information it was possible to inform growers and store
owners that present methods of handling apples were not causing undue
damage as measured by the volume of apples purchased by customers.⁵
Only two per cent of the apples in the 504 sample records were
as badly damaged as the experimental treatment which had the most
bruising, and this treatment was the only one to which buyers responded
through decreased purchases. Measuring the effect on sales of this two
per cent would have been very difficult, if not impossible, if only the
sample survey data had been available. This illustrates one of the differences
between controlled and uncontrolled experiments and how the
two can be combined to advantage.


Studies of Merchandising Practices 


In the process of making the studies on bruising many varied practices
of pricing, displaying, and packaging apples were observed together
with highly varying sales rates. This raised the question of how
these practices affected sales. To obtain information on this Dominick
conducted a series of experiments on these as well as innovated variables.⁶
A series of 4×4 latin squares were used in 4 stores as columns and
4 time periods of 1 or ½ days as rows (Figure 1). Over a period of 12
weeks, in the fall of 1950, 16 different merchandising practices (the
treatments) were compared, and 24 individual experiments were conducted.

Because approximately half of the volume of grocery sales occurs on
Friday and Saturday and because larger grocery orders per customer
are purchased on weekends the week was divided into two parts. The
first part of the week consisted of the first four days. On weekends both
Friday and Saturday were divided into two parts so that the two days
combined formed four time periods. Thus there were two latin squares
in each week. Each set of four treatments was tested over a two-week
period.

⁵ Brunk, Max E., “Influence of Bruising on the Sale of Apples,” Proceedings New York State Horticultural
Society, 95: 73-80, 1950.

⁶ Dominick, Jr., B. A., Merchandising McIntosh Apples Under Controlled Conditions: Customer
Reaction and Effect on Sales, Ph.D. Thesis, Cornell University Library, Ithaca, New York, 1952.

The treatments selected for testing in any two-week period depended
largely upon the results of the preceding experiments. This practice
quickly led to innovations in the selection of treatments, care being
taken to determine the practicability of any treatment before it was
included in an experiment. This sequential selection of treatments, although
not formalized by mathematical rule, resulted in the selection
of 16 different merchandising practices whose sales varied from 11 to
33 pounds of apples per 100 customers.⁷

        First part of week                 Second part of week
                    Store                                  Store
  Day           1   2   3   4        Day               1   2   3   4
  Monday        B   C   D   A        Friday a.m.       B   A   C   D
  Tuesday       A   B   C   D        Friday p.m.       C   D   B   A
  Wednesday     D   A   B   C        Saturday a.m.     A   B   D   C
  Thursday      C   D   A   B        Saturday p.m.     D   C   A   B

FIGURE 1. Diagrammatic Lay-out of Two 4×4 Latin Squares
for Four Treatments (A, B, C, D).

The most effective treatment, an innovation, was recommended to
the trade less than a month after the store tests were completed. Within
two years the treatment, though modified in some cases, was in general
practice by the trade with over two-thirds of the apples so sold in Western
New York. The widespread application of the results of the experiment
led to many associated problems beyond the scope of the research.
For example, the New York State legislature promptly amended
the grading laws to facilitate the use of this merchandising practice. 
Also new packing methods were developed on numerous farms. Me- 
chanical bagging equipment was developed and new master shipping 
containers were devised after much trial and error by the trade. 

The final test of the validity and usefulness of any research is the 
experience of actual application. As previously indicated the results of 
these experiments were in wide application very shortly after the tests 
were conducted but only isolated instances of experience are available.
The recommended method of merchandising apples
was tried over a twelve-week period in the fall of 1951 in ten stores of a
large national chain organization.⁸ Their sales increased 42 per cent and
the practice was quickly extended to other stores in the chain. The results
of the controlled experiments had indicated that the apple sales
of this organization could be expected to increase 40 per cent. Another
large chain organization using the innovation of 1952 reported almost
identical volume (pounds) of sales in 1951 and 1952, but at 60 per cent
higher retail prices. By 1953 practically all chain organizations in the
country and thousands of independent grocers had incorporated the results
of these experiments into their merchandising practices.

⁷ ... “What Makes Your Apples Sell,” ...

In December 1952 apple prices were more than double the prices ex- 
isting during the first tests in 1950 and some question arose concerning 
the effect this price increase might have had on the recommended mer- 
chandising practice which consisted of a combined bulk and polythene 
package display priced in 6 pound units. Consequently a latin square 
experiment was conducted comparing 2, 4 and 6 pound pricing units as 
had been done in 1950. The same stores were used for the tests. Again 
the recommended practice proved most effective in maximizing sales 
with results similar to those obtained in 1950.


Study of Carry-over Effects 


Many of the treatments tested during 1950 also were retested by 
Henderson in 1951 under a different price situation and in 12 different 
stores located in 12 large cities.9 The conclusions, without exception, 
were the same as those obtained in 1950 (Table 1). A new feature was 
incorporated in the design of these experiments.10 Because the day to 
day rotation of treatments among stores created an artificial condition 
not normally found in the market place, it was desirable to determine 
the effect of given treatments on following treatments. To do this the 
treatments were rotated among stores every week instead of every day 
and a double change-over design was used.11 In using the change-over 
design particular treatments must be in given stores a sufficient time 
to insure that carry-over effects stem only from immediately preceding 


8 [Surname illegible], Lloyd H., “Marketing Research Results Work,” Cornell Farm Economics, 186: 4888-4889, 
October 1952.

9 Henderson, P. L., Influence of Selected Marketing Services on Apple Sales, Ph.D. Thesis, Cornell 
University Library, [remainder of entry illegible] Apples, Proceedings New York Horticultural Society, 97: 24-83, 1952.

10 Henderson, P. L., “Application of the Double Change-over Design to Measure Carry-over Effects 
of Treatments in Controlled Experiments,” Methods of Research in Marketing, Paper No. 8, Department 
of Agricultural Economics, Cornell University, July 1952.

11 Cochran, W. G., Autrey, K. M., and Cannon, C. Y., “A Double Change-over Design for Dairy 
Cattle Feeding Experiments,” Journal of Dairy Science, 24: 937-951, 1941.
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TABLE 1 
EFFECT OF MERCHANDISING PRACTICES ON APPLE SALES

                                                      Pounds per
Practice                                              100 customers

September-December 1950

Promotional devices (all 4 lb. bag and bulk)
  Display without promotional devices                     20
  Display marked as to variety and use                    20
  With window streamers added                             20
  Display doubled in size                                 21
  With added window display of apples                     25
Bulk only
  Priced in two-pound units                                1
  Priced in four-pound units                              13
Package only
  Four-pound Cellophane bags                              18
Combination package and bulk displays
  Two-pound Cellophane bags and bulk                      13
  Four-pound Cellophane bags and bulk                     20
  Four-pound Polythene bags and bulk                      23
  Six-pound Polythene bags and bulk                       28
  Six-pound open hi-hat baskets and bulk                  21
Quality and price (all 4 lb. bag and bulk)
  21* min, priced 25% under 23"
  Bruise-free apples
  Price reduced 35%
  Highly colored apples

September-December 1951

Packaging material (all in 5 lb. units)
  No packages, bulk only                                  12
  Red mesh bags and bulk                                  17
  Paper window bags and bulk                              18
  Pliofilm bags and bulk                                  19
  Purple mesh bags and bulk                               20
  Polythene bags and bulk                                 22
Size of pricing unit
  Four-pound Polythene bags and bulk                      19
  Five-pound Polythene bags and bulk                      22
  Six-pound Polythene bags and bulk                       27
  Eight-pound Polythene bags and bulk                     20

  Five-pound mesh bags and bulk                           20
  Eight-pound mesh bags and bulk                          20
  Ten-pound mesh bags and bulk                            20
Location of display
  By scales
  End of counter next to no fruit                         22
  End of counter next to oranges                          21
  End of counter next to bananas                          19



treatments and not from earlier treatments.12 It is believed that weekly 
rotations are satisfactory with most perishable foods, particularly in 
view of the weekly shopping habits of people. 


12 The double change-over design consists of the k-1 orthogonal k×k latin squares. The treatments 
are compared in various sequences. The double change-over design retains the advantages of the latin 
square in eliminating store and time effects and at the same time permits the measurement of carry-over 
effects. When carry-over effects are not present k-1 ordinary latin squares may be used instead of the double 
change-over design. If carry-over effects are present and a double change-over design is used, 
adjustments are made in the treatment means for the effect of the preceding treatment. Such adjust- 
ments tend to reduce the experimental error and to give unbiased comparisons of the treatment effects. 
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The double change-over design was found useful in measuring the 
effect of the Thanksgiving and Christmas holiday trade. In one such 
instance using two orthogonal 3×3 latin squares, carry-over effects of 
the treatment in one 3×3 latin square were the reverse of those in the 
second 3×3 latin square. The second square was completed just prior 
to Christmas, a time when customers were buying relatively more of 
the larger packages as compared to their performance in the first 
square. Thus, the design proved useful in pointing up and detecting 
variation in buying habits at different times during the season. 
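The balance property that allows a double change-over design to measure carry-over effects can be checked directly. The following sketch is not taken from the study: it builds the k-1 mutually orthogonal squares by the standard cyclic rule (which assumes k is prime), treats columns as stores and rows as weekly periods, and counts how often each treatment immediately follows each other treatment in the same store. All function names are illustrative.

```python
from collections import Counter


def latin_square(k, m):
    """k x k square: rows are periods, columns are stores, entries are treatments."""
    return [[(m * period + store) % k for store in range(k)] for period in range(k)]


def double_change_over(k):
    """The k-1 squares of the design; the cyclic rule assumes k is prime."""
    return [latin_square(k, m) for m in range(1, k)]


def are_orthogonal(first, second):
    """Two squares are orthogonal if every ordered pair of symbols occurs once."""
    k = len(first)
    pairs = {(first[p][s], second[p][s]) for p in range(k) for s in range(k)}
    return len(pairs) == k * k


def carry_over_counts(squares):
    """Count (preceding treatment, current treatment) pairs within each store."""
    counts = Counter()
    for square in squares:
        k = len(square)
        for store in range(k):
            for period in range(1, k):
                counts[(square[period - 1][store], square[period][store])] += 1
    return counts


squares = double_change_over(3)
counts = carry_over_counts(squares)
print("orthogonal:", are_orthogonal(squares[0], squares[1]))
for pair in sorted(counts):
    print(pair, counts[pair])
```

With three treatments, each of the six ordered pairs of distinct treatments occurs the same number of times, which is what makes the adjustment for the preceding treatment possible.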


Comments on Techniques and Efficiencies 


The above discussion illustrates the application of two very useful 
designs to marketing research experiments. Of course other designs, the 
randomized complete block, the split plot, and the lattices, may be suc- 
cessfully used for studying certain marketing problems. The particular 
nature of the problem and the sources of variation will determine the 
appropriate experimental design. 

It is interesting to note that missing values for the period of observa- 
tion, or “missing plots,” may and do occur in marketing research stud- 
ies just as they do in other fields of research. Failure to keep records or 
lost records is only one source of omission. Sometimes unforeseen de- 
velopments will occur such as street repairs in front of a store over a 
period of time. If a street is torn up in front of a store the customer 
count may decline far more than total sales because the obstruction will 
affect small sales more than the large ones. Also, fire or flood may pre- 
vent a store from operating in the accustomed manner. The analysis of 
experiments with missing observations may be handled in the usual 
manner as described by Cochran and Cox, Snedecor and others. 
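The formula itself is not reproduced here. As a sketch of the usual textbook estimate for a single missing observation in a k×k latin square, x = [k(R + C + T) - 2G] / [(k - 1)(k - 2)], where R, C and T are the totals of the remaining observations in the affected row, column and treatment and G is the total of all remaining observations. The data below are hypothetical and exactly additive, so the estimate should reproduce the value that was removed.

```python
def missing_value_estimate(k, row_total, col_total, trt_total, grand_total):
    """Textbook estimate of one missing cell in a k x k latin square."""
    return (k * (row_total + col_total + trt_total) - 2 * grand_total) / ((k - 1) * (k - 2))


k = 4
treatment = [[(r + c) % k for c in range(k)] for r in range(k)]

# Hypothetical additive sales: period effect + store effect + treatment effect.
period_effect = [20, 22, 25, 21]
store_effect = [0, 3, 1, 2]
treatment_effect = [0, 2, 5, 8]
sales = [
    [period_effect[r] + store_effect[c] + treatment_effect[treatment[r][c]] for c in range(k)]
    for r in range(k)
]

# Pretend the record for period 1, store 2 was lost (zero-based indices).
lost_r, lost_c = 1, 2
lost_t = treatment[lost_r][lost_c]
observed = [(r, c) for r in range(k) for c in range(k) if (r, c) != (lost_r, lost_c)]

row_total = sum(sales[r][c] for r, c in observed if r == lost_r)
col_total = sum(sales[r][c] for r, c in observed if c == lost_c)
trt_total = sum(sales[r][c] for r, c in observed if treatment[r][c] == lost_t)
grand_total = sum(sales[r][c] for r, c in observed)

estimate = missing_value_estimate(k, row_total, col_total, trt_total, grand_total)
print("estimate:", estimate, "removed value:", sales[lost_r][lost_c])
```

When a value is estimated in this way, the error degrees of freedom are reduced by one in the subsequent analysis.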

To obtain an idea of the effect of stratification by time intervals and 
by stores the results of 34 experiments (Table 2) were studied. As a 
measure of relative variation in the various experiments the coeffi- 
cient of variation was computed for each experiment. The coefficients 
of variation were higher for the 24 experiments conducted in 1950 than 
for the others. In these experiments the time interval was one day while 
in the remaining experiments the time interval was either a one or two 
week period. Thus, one method for reducing the coefficient of variation 
is to use time periods of one week rather than of one day. It should be 
noted here that the coefficient of variation was computed from the 
residual mean square in the latin square without covariance. 
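As an illustration of that computation, the sketch below removes row, column and treatment means from a latin square, forms the residual mean square on (k - 1)(k - 2) degrees of freedom, and expresses its square root as a percentage of the general mean. The sales figures and the function name are hypothetical; only the definition of the coefficient of variation is taken from the text.

```python
from math import sqrt


def latin_square_cv(y, treatment):
    """Coefficient of variation (per cent) from the latin square residual mean square."""
    k = len(y)
    grand_mean = sum(sum(row) for row in y) / (k * k)
    row_mean = [sum(y[r]) / k for r in range(k)]
    col_mean = [sum(y[r][c] for r in range(k)) / k for c in range(k)]
    trt_mean = [0.0] * k
    for r in range(k):
        for c in range(k):
            trt_mean[treatment[r][c]] += y[r][c] / k

    ss_error = 0.0
    for r in range(k):
        for c in range(k):
            fitted = row_mean[r] + col_mean[c] + trt_mean[treatment[r][c]] - 2 * grand_mean
            ss_error += (y[r][c] - fitted) ** 2

    ms_error = ss_error / ((k - 1) * (k - 2))   # residual mean square
    return 100 * sqrt(ms_error) / grand_mean


# Hypothetical pounds sold per 100 customers; rows are periods, columns are stores.
treatment = [[(r + c) % 4 for c in range(4)] for r in range(4)]
sales = [
    [21, 25, 30, 19],
    [24, 33, 18, 22],
    [29, 20, 23, 27],
    [17, 24, 26, 31],
]
print(round(latin_square_cv(sales, treatment), 1))
```

Because the measure is a ratio, it does not depend on the unit of sale, which is what makes it usable for comparing experiments of different scale.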

The efficiencies of the latin square relative to randomized complete 
block designs using stores as replicates are given in column five of Ta- 
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TABLE 2 


RELATIVE VARIATION AND EFFICIENCY DUE TO STRATIFICATION
OR COVARIANCE IN 34 LATIN SQUARE ANALYSES

                                          Efficiency relative to
                                     Randomized complete
Size of   Experiment    Coefficient  blocks using as                     Efficiency
latin     conducted     of variation replicates          Completely      using covariance
square    Yr.     No.   (per cent)   Stores    Times     randomized      analysis

8×8       1948     1      19.7        214       214         303             102
4×4       1949     1      17.7        112       702         572             148
4×4       1949     2      15.5        113       243         220             174
4×4       1949     3       7.9        146      1241        1013              94
4×4       1950     1      45.1        120       143         148             132
4×4       1950     2      30.0        341       126         306             116
4×4       1950     3      31.1        108       141         138              83
4×4       1950     4      25.7        210       149         222              81
4×4       1950     5      32.73        90       192         163              80
4×4       1950     6      25.75       220       181         255              86
4×4       1950     7      37.16       124       159         164             160
4×4       1950     8      37.93       225       152         237             132
4×4       1950     9      40.58        98        71          76              84
4×4       1950    10      47.0        102       128         193              78
4×4       1950    11      31.06       184       185         230             162
4×4       1950    12      36.72       115       150         150              98
4×4       1950    13      34.64        95       158         140              81
4×4       1950    14      24.23       286       125         263             114
4×4       1950    15      38.32       152       101         141              80
4×4       1950    16      47.79       108        94         102              61
4×4       1950    17      39.60       225       120         211              90
4×4       1950    18      42.41       152       128         161              80
4×4       1950    19      53.48        80        82          72             100
4×4       1950    20      21.96       135       301         282              80
4×4       1950    21      36.24       121       284         257              80
4×4       1950    22      27.88       116       152         153             148
4×4       1950    23      48.45        84        91          81             278
4×4       1950    24      19.77       141       166         182
6×6       1951     1      19.19       132       226         234             112
6×6       1951     1      16.67       101       368         328             107
4×4       1951     2       6.09        72      3237        2492              85
4×4¹      1951     2      22.34       182      3921        3102             118
4×4²      1951     2       6.72       283      9977        7838              90
4×4       1951     2      14.13       170       731         693

¹ Other apples. ² All apples. ³ Oranges.

a Covariance on volume of grocery and produce sales.
b Covariance on number of customers.
c [Illegible.]
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ble 2. If the time interval stratification is ignored, the average error 
variance in the 24 experiments in 1950 is 51 per cent larger; the median 
increase in efficiency of the latin square over the randomized complete 
block is 22.5 per cent. If the store stratification is ignored but the time 
period grouping is not, the average increase in efficiency of the latin 
square is 49 per cent, while the median increase is 42 per cent (column 
6, Table 2). If both the store and time interval variation are not con- 
trolled the average increase in the error mean square for the completely 
randomized design in these same 24 experiments is 77 per cent, and the 
median increase is 62 per cent. The other experiments were not included 
in these averages because the period of observation was of different 
length. 
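The summary figures quoted for column five can be retraced from the table. The sketch below takes the 24 entries for 1950 as they are printed in Table 2 and computes their mean and median. It also gives the standard textbook expressions for the efficiency of a latin square relative to randomized blocks and to a completely randomized design; whether these exact expressions, with or without a correction for degrees of freedom, were the ones used in the study is an assumption, and the function names are illustrative.

```python
from statistics import mean, median


def efficiency_vs_randomized_blocks(ms_dropped, ms_error, k):
    """Per cent efficiency when one stratification (mean square ms_dropped) is ignored."""
    return 100 * (ms_dropped + (k - 1) * ms_error) / (k * ms_error)


def efficiency_vs_completely_randomized(ms_rows, ms_cols, ms_error, k):
    """Per cent efficiency when both row and column stratifications are ignored."""
    return 100 * (ms_rows + ms_cols + (k - 1) * ms_error) / ((k + 1) * ms_error)


# Column five of Table 2 (stores as replicates), experiments 1 to 24 of 1950.
stores_as_replicates = [
    120, 341, 108, 210, 90, 220, 124, 225, 98, 102, 184, 115,
    95, 286, 152, 108, 225, 152, 80, 135, 121, 116, 84, 141,
]

average = mean(stores_as_replicates)
middle = median(stores_as_replicates)
print(f"average efficiency: {average:.1f} per cent")
print(f"median efficiency:  {middle:.1f} per cent")
```

An average near 151 and a median of 122.5 correspond to the 51 per cent and 22.5 per cent increases cited for the case where the time interval stratification is ignored.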

The analysis of covariance of apple sales and total number of cus- 
tomers, total grocery sales, or total produce and grocery sales was of 
limited usefulness in these studies. The removal of store and time in- 
terval differences in the latin square accounted for most of the rela- 
tionship between the covariate and volume of apple sales. The residual 
variations were not related to any extent. If the variation due to stores 
and time intervals were not removed then covariance analyses may be 
expected to decrease the error variance considerably, but not to the 
extent that the latin square did. In other studies the use of covariance 
analyses may prove quite beneficial. 


A Sampling Program 


Having effected material improvement in the merchandising of ap- 
ples and having ascertained some of the important factors affecting 
their sales, the industry was anxious to use this information to achieve 
an orderly movement of the crop into consumption. Experience from 
the previous work indicated that observations of sales coupled with 
customer counts might serve as an indicator of movement rate from 
week to week. Rate of movement together with descriptions of store 
practices would enable the industry to undertake remedial action as 
soon as undesirable developments occurred. Over a large number of 
stores the movement rate could be affected by a number of factors chief 
among which are: (1) merchandising practice used, (2) proportion of 
stores handling apples, (3) relative display space devoted to apples, (4) 
prices of apples and other fruits, and (5) quality condition of apples and 
competing fruits. 

Although previous experience had revealed a high degree of con- 
sistency in the customer reactions to different selling practices among 
different stores, there still remained a tremendous problem of how to 
efficiently sample stores over a wide geographic area. Published lists 
of stores were available for most towns and cities in the western half 
of New York State which was chosen for study but it was desirable to 
know something of the effects of the geographic area, size of city, size 
of store, day of week and time of day on the rates of sale. To insure the 
measurement of all these variables with a relatively restricted budget 
some form of experimental design seemed to offer definite advantages. 
The first purpose of the study was to learn more about the influence of 
the above factors on rate of sale. The second purpose was to provide 
a crude measure of movement rate from week to week for release to the 
trade until a more adequate coverage could be obtained. At the outset 
it was decided that the second purpose should be subservient to the 
first. 
Since the unit of observation in this study was the customer in the 
store, the question might arise as to why people were not interviewed 
in their homes or why rate of movement information was not obtained 
from weekly store inventories. Direct observation of actual customer 
performance has many advantages in avoiding memory biases, in en- 
abling enumerators to cover much larger numbers of shoppers, and in 
associating specific merchandising practices with shopper performance. 
Assuming that accurate store inventories could be obtained (and there 
is good reason to doubt it) there would still remain the problem of de- 
termining how the product was merchandised as well as shopper re- 
sponse to such practice. 

Because the sales rate on weekends varied considerably from the first 
parts of the week, it was decided to make one visit to each selected 
store in each part of the week and during each visit take customer 
counts and sales for a one hour period. The budget limited such cover- 
age to 64 stores. The area selected was Western New York which was 
divided into four geographic areas. In each of these areas 4 sizes of cities 
were selected. Lists were prepared of all places over 100,000 population, 
20,000-100,000, 5,000-20,000 and under 5,000. 

It so happened that there was only one city in each area having over 
100,000 population so these were automatically selected. Random se- 
lections were then made of one city in each area from the second size 
grouping, 2 cities from the third and 4 cities in each area from those un- 
der 5,000 population. Many small cities are clustered around the larger 
ones with the shopping areas for the smaller places being in the larger 
cities. For this reason it was necessary to impose a restriction that any 
smaller city selected be at least 10 miles from a larger city. Routes were 
then constructed for each area with 4 stores in each of the two larger- 
sized cities, 2 each in cities of 5,000 to 20,000 and 1 each in towns under 
5,000 population. Lists in each area were used to select stores, half be- 
ing small and half large stores. The plan was so constructed that visits 
to any one store were made in succeeding weeks at precisely the same 
hour and day of week. Within any one two-hour period throughout the 
week one store was enumerated in each of the 4 geographic areas, in 
each of the 4 sizes of cities and half the stores were small and half were 
large. 
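The allocation just described can be tallied as a check that it accounts for the 64 stores allowed by the budget. The short sketch below only restates the numbers given above; the dictionary layout and variable names are illustrative.

```python
AREAS = 4

# City size class -> (cities selected per area, stores enumerated per city)
plan = {
    "over 100,000": (1, 4),
    "20,000-100,000": (1, 4),
    "5,000-20,000": (2, 2),
    "under 5,000": (4, 1),
}

stores_per_class = {size: cities * stores for size, (cities, stores) in plan.items()}
stores_per_area = sum(stores_per_class.values())
total_stores = AREAS * stores_per_area

for size, count in stores_per_class.items():
    print(f"{size:>15}: {count} stores per area")
print("stores per area:", stores_per_area)
print("total stores:", total_stores)
```

Each city size class receives the same number of stores in every area, so the sample is deliberately not proportional to the number of stores or customers in each class.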

All combinations of the variables (geographic area, city size, day of 
week, time of day and size of store) constitute a 2 × 4⁴ factorial. The 
possible combinations total 512. From these combinations the 64 given 
in Figure 2 were selected. The fractional replicate selected was con- 


[Figure 2 lists the 64 selected treatment combinations, each written as the levels of factors a, b, c, d, e. The individual combinations are not legible in this copy. The legend reads as follows.]

Level of factor

a0 = large store; a1 = small store
b0 = cities over 100,000; b1 = cities between 20,000 and 100,000; b2 = cities between 5,000 and 20,000; b3 = cities under 5,000
c = geographic area (Buffalo, Binghamton, Syracuse and Rochester areas)
d = day of week (for the first part of the week; Tuesday, Wednesday and Thursday are legible)
e0 = 8 A.M. to 10 A.M.; e1 = 10 A.M. to noon; e2 = noon to 2 P.M.; e3 = 2 P.M. to 4 P.M. (for the first part of the week)

Figure 2. Sixty-four Treatments Used in Studying Rate of Movement. 
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structed so as not to confound main effects. The time periods within 
each week were divided to permit two complete sets of the 64 combina- 
tions—one set during slack trading hours and one during heavy trading 
hours. Thus each store in the design was enumerated twice a week and 
at precisely the same hours in succeeding weeks. 
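To make the size of the fraction concrete, the sketch below enumerates the full 2 × 4⁴ factorial and selects a one-eighth fraction of 64 combinations. The selection rule is hypothetical and is not the fraction shown in Figure 2; it was chosen only because it reproduces the properties stated in the text: every level of every factor appears equally often, and within each day and time period there is one store in each area, one in each city size, and half the stores are large.

```python
from collections import Counter
from itertools import product

# Factor levels: a = store size, b = city size, c = area, d = day, e = time of day.
LEVELS = {"a": 2, "b": 4, "c": 4, "d": 4, "e": 4}
full_factorial = list(product(*(range(n) for n in LEVELS.values())))


def in_fraction(a, b, c, d, e):
    """Hypothetical defining rule for a one-eighth fraction (not the authors')."""
    return e == (b + c + d) % 4 and a == (c + d) % 2


fraction = [combo for combo in full_factorial if in_fraction(*combo)]
print("full factorial:", len(full_factorial))
print("fraction:", len(fraction))

# Main-effect balance: each level of each factor occurs equally often.
for index, name in enumerate(LEVELS):
    print(name, dict(Counter(combo[index] for combo in fraction)))

# Within one day and time period: one store per area and per city size.
one_period = [combo for combo in fraction if combo[3] == 0 and combo[4] == 0]
print("stores in one two-hour period:", one_period)
```

Balance of this kind keeps the comparison of levels of one factor from being tied to a particular level of another, though interactions may still be mixed with main effects in so small a fraction.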

Weekly enumerations were completed each Saturday night at 6 
o'clock. Office tabulations were made daily as the field reports were re- 
ceived so that summaries were completed by Monday noon for each 
preceding week. These completed reports on movement rate were 
mailed to the trade by Monday evening. The greatest delay in tabula- 
tion resulted from making adjustments in the non-proportional sam- 
pling which was necessitated by the experimental design. The sum- 
maries reported the rate of sale per 100 customers, quality indices and 
retail prices of each variety, size of pricing unit, a description of display 
practices as well as qualities, prices and display space of other fruit. 
Experience has shown that these factors are associated with rate of 
movement and the information proved useful to the trade in taking 
correct remedial action in maintaining the movement of apples into 
consumption consistent with storage inventories. 
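The adjustment for non-proportional sampling amounts to weighting each stratum by its share of the population rather than by its share of the sample. The figures and weights in the sketch below are entirely hypothetical, and the weighting scheme is an assumption about the kind of adjustment involved rather than a description of the tabulation actually used.

```python
def rate_per_100(pounds, customers):
    """Pounds of apples sold per 100 customers counted."""
    return 100 * pounds / customers


# Hypothetical strata: (pounds sold, customers counted, assumed share of all customers)
strata = {
    "large cities": (40, 200, 0.75),
    "small towns": (30, 100, 0.25),
}

# Pooling the raw counts over-represents strata that were sampled heavily.
total_pounds = sum(p for p, _, _ in strata.values())
total_customers = sum(n for _, n, _ in strata.values())
unweighted = rate_per_100(total_pounds, total_customers)

# Weighting each stratum's rate by its assumed population share corrects for this.
weighted = sum(share * rate_per_100(p, n) for p, n, share in strata.values())

print(f"pooled rate:   {unweighted:.1f} pounds per 100 customers")
print(f"weighted rate: {weighted:.1f} pounds per 100 customers")
```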

Combining probability sampling with an experimental design in this 
instance served to evaluate certain variables for use in the designing of 
an improved sample for future use and at the same time permitted some 
degree of estimate of the current movement situation together with its 
associated causes. 


IMPROVING NATIONAL MARRIAGE AND 
DIVORCE STATISTICS* 


HUGH CARTER 
National Office of Vital Statistics 


The principal objectives of the program for improving our present 
marriage and divorce statistics are to provide prompt and compre- 
hensive data on marriages and divorces that occur in the United 
States and to give such details regarding the social characteristics of 
the persons involved as are needed by users of these statistics. The rate 
of formation of new families, as well as the rate of dissolution of estab- 
lished families is of interest to sociologists, economists, demographers, 
social workers, and many other professional and business groups. 
Statisticians concerned with population projections have recently 
shown an increased interest in the role of marriage data as an aid in 
forecasting births. At present, international comparisons of marriage 
statistics emphasize the incompleteness of the United States figures. 
Distribution of the population by marital status is given in Bureau 
of the Census data; for 1950 the figures are available with considerable 
detail as to social characteristics. By contrast, the registration of 
marriages, or divorces, for 1950 provides a count of occurrences within 
the year and information concerning the social characteristics of the 
individuals at the time of registration. Since registration statistics are 
based upon legal documents, certain types of closely related events, 
such as consensual or common law marriages, are not included in 
these periodic counts. Final decrees of absolute divorce are tabulated 
and exclude limited decrees and separations. The present paper will 
review the steps now being taken to improve national marriage and 
divorce statistics and to indicate some of the problems involved. As 
background it will summarize the earlier efforts of the Federal Govern- 
ment in this field. Registrations occur in local communities, typically 
in the community that is the county seat. In a majority of the States, 
a record of marriage or divorce is transmitted to the State Registrar of 
Vital Statistics. 

Improvement of marriage and divorce statistics can take place only 
on the basis of close cooperation between the federal, State and local 
agencies involved. Fortunately, such cooperation is already well ad- 
vanced. State registrars of vital statistics are accustomed to close 
cooperation with local officials. The pattern of this cooperation has 
been hammered out over the years through other programs, such as 


* Presented to the American Statistical Association Meeting in Chicago, December 30, 1952. 
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registration of births and deaths. In most States, the State registrar 
maintains a field staff to work closely with local officials. 

Cooperation between federal and State agencies is greatly facilitated 
by the Public Health Conference on Records and Statistics, hereafter 
referred to as the Conference. This organization was created to provide 
for interchange and discussion of ideas and problems relating to 
public health statistical programs and to encourage cooperative 
action by the representatives of federal, State, and other organizations 
included in the membership. Represented at the Conference are indi- 
viduals concerned with registration and health statistics activities of 
each State, Territory, and independent registration area of the United 
States. Also part of the Conference are representatives of the American 
Association of Registration Executives, the American Public Health 
Association, and the National Office of Vital Statistics of the Public 
Health Service. The Working Group on Marriage and Divorce Regis- 
tration of the Conference has for some years been preparing a compre- 
hensive federal-State program of marriage and divorce registration 
and statistics. 

Before proceeding with the discussion of plans for improving mar- 
riage and divorce statistics, it may be useful to glance at the history of 
the registration and reporting of marriages and divorces and to note 
what data are presently available on a yearly or monthly basis. It 
will be evident from this survey that the past century has witnessed 
substantial improvement in the reporting of these data. While there 
have been many serious set-backs to the program, and while much 


Decennial Census of 1850 and in several subsequent censuses, with 
admittedly “very deficient” results.1,2 Marriages and divorces during 
the period 1867-1906 were compiled in two surveys based on the 
original records in county seats. During the next 15 years, except for 
1916, no national statistics on marriages and divorces were collected; 
but beginning with data for 1922, the Bureau of the Census undertook 
an annual collection program which continued for 11 years.5 In 1928 
it published estimates for the missing years 1907-15 and 1917-21. 
1 Population of the United States [remainder of entry illegible]. 
2 [Title illegible], Census Office, 9th Census, 1873. 

3 Marriage and Divorce in the United States, 1867 to 1886, by Carroll D. Wright, Commissioner of 
Labor, 1889 (out of print). 
4 Marriage and Divorce, 1867-1906, Bureau of the Census, 1908 (out of print). 

5 Marriage and Divorce, Annual Reports, 1922-32, Bureau of the Census, 1925-34. 
6 Ibid., 1926. 
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For the years 1933-36, the best available national estimates are those 
of Stouffer and Spencer.7 

During the 1940 Census period, the Bureau instituted marriage and 
divorce collection programs patterned after those already operating in 
the collection of data on births and deaths. This program was short- 
lived but helped produce enough data for estimates to be made for the 
years 1937 through 1940.8 A program based on data collected by mail 
from a variety of sources was begun several years later,9 and in July 
1946 the function was transferred to the Public Health Service as an 
integral part of the National Office of Vital Statistics. 

Annual summaries of marriage and divorce statistics for the United 
States, by State, have been published for the years 1946 through 1950. 
For a substantial number of States it has been necessary to use figures 
for “marriage licenses" rather than “marriages,” and for a few States, 
where reporting was incomplete, estimates have been made. The figures 
are tabulated by the State in which the marriage or divorce occurred. 

Monthly national and State figures on marriages (licenses or mar- 
riages reported)10 are obtained from 25 State offices and from local 
officials in 23 States. Other monthly figures include marriage licenses 
for each of the major cities and divorces and annulments for 19 States. 

For the specified States that can provide the data, the National 
Office of Vital Statistics also publishes detailed reports on marriage 
and divorce. This cooperative project does not include all of the States. 
The marriage report gives ages of bride and groom by first marriage 
and remarriage, race, and residence or nonresidence in State of occur- 
rence. The report on detailed statistics of divorce12 includes tables on 
legal grounds for divorce, party to whom the decree was granted, 
duration of marriage, and numbers of children reported. Both of 
these reports contain a number of tables with detailed cross tabulations 
of the data. From time to time special studies are published, the most 
recent being an analysis of seasonality in marriage licenses. Since 
1946, statistics on marriages and divorces in the United States have 


7 “Recent Increases in Marriage and Divorce,” American Journal of Sociology, January, 1939, 
551-54. 
8 Estimated Number of Marriages by State: United States, 1937-40, Bureau of the Census, 1942; 
and Estimated Number of Divorces by State: United States, 1937-40, Bureau of the Census, 1942. 

9 Marriage and Divorce in the United States, 1937 to 1945, National Office of Vital Statistics, Vital 
Statistics—Special Reports, Vol. 23, No. 9, 1946. 

10 See “Monthly Vital Statistics Report,” Vol. I, 1952 and [remainder of entry illegible]. 

11 Statistics on Marriages: Specified States, 1950, National Office of Vital Statistics, Vital Statistics— 
Special Reports, Vol. 37, No. 5, 1952. 

12 Statistics on Divorces and Annulments: Specified States, 1950, National Office of Vital Statistics, 
Vital Statistics—Special Reports, Vol. 37, No. 4, 1952. 

13 Seasonal [words illegible] Licenses, National Office of Vital Statistics, Vital Statistics— 
Special Reports, Selected Studies, Vol. 33, No. 12, 1952. 
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also been included in the annual volumes of the National Office of 
Vital Statistics.14 

Turning specifically to questions of improving national marriage 
and divorce statistics it seems clear that to provide comparability for 
figures prepared under many independent jurisdictions located in every 
part of the country there must be careful agreement regarding standard 
procedures, report forms, and definition of important terms. While 
many individual States have excellent statistical programs, consider- 
able difficulty is encountered in bringing these together into meaningful 
national statistics because of the variation in the State report forms. 
Thus, on the majority of State marriage certificates there is an item 
concerned with “number of marriages.” This is variously worded 
“number of previous marriages,” “number of marriages,” “number of 
proposed marriage,” and “any prior marriage.” About one-third of the 
State certificates do not contain this item. Similar variations may be 
noted in questions on "occupation," "birthplace," and several other 
items. There is nothing surprising about these variations in wording 
except that they are not more extensive. 

Progress is being made toward the necessary marriage and divorce 
standard certificates (or statistical report forms). The Working Group 
on Marriage and Divorce Registration of the Conference, which in- 
cludes some of the leading State registrars as well as representatives 
of important users of marriage and divorce statistics, has prepared 
suggested minimum lists of items to be included on the standard 
certificates. In order to have available a comprehensive picture of 
the needs of consumers and producers of marriage and divorce statis- 
tics, questionnaires listing the minimum items suggested by the Work- 
ing Group and a few frequently proposed additions were distributed to 
a representative list of users of these statistics and to all the State 


14 See “Vital Statistics of the United States,” Part I, National Office of Vital Statistics, Public 
Health Service, Department of Health, Education, and Welfare. Government Printing Office, 1948-51. 
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to be interested in these problems, and to representatives of a few 
private agencies and federal government agencies who are thought to 
have a special interest. We call these the consumer group, although the 
list is far from complete, an important omission being the consumers 
in business organizations. There will be other samplings of the opinions 
of consumers of these statistics, and we shall welcome suggestions of 
persons or groups that should be queried. 

On most of the items there was a large measure of agreement among 
the respondents. This is natural since there is a clear need for such 
items of identification as name, place of residence, and age. Users of 
these statistics frequently request data, now unavailable, on the 
social characteristics of the persons involved. Disagreement is noted 
when proposed questions are not clearly needed for identification pur- 
poses. Two possible items will serve to illustrate the problems of 
preparing good report forms in this field: “occupation and industry” 
and “last grade of school completed.” On both the marriage and 
divorce questionnaires these items were listed and respondents were 
invited to comment, as well as to check “yes or no,” whether each item 
should be included in the standard certificates. 

The item “occupation and industry,” was discussed in the Working 
Group, and there was some support for it, though not majority sup- 
port. Moreover, several States have this item on existing forms. Re- 
sults of the questionnaires reflected divided sentiment in the registrar 
group: of those expressing a positive opinion, a small majority favored 
its inclusion on the marriage certificate, while sentiment was almost 
evenly divided regarding its inclusion on the divorce certificate. In the 
consumer group there was a strong majority for including the item 
“occupation and industry” on both certificates. On the other hand, for 
the suggested question “last grade of school completed” the registrar 
group was opposed to its inclusion by a large majority, while among the 
consumer group an even larger majority favored its inclusion. 

There are many sides to the question of whether a given item should 
be included on a standard certificate. One asks first who needs this 
information and for what purpose; obviously, information is not 
gathered to satisfy idle curiosity. Moreover, the ease and accuracy 
with which information can be recorded and tabulated are important 
since thousands of local officials in all parts of the country must record 
it. A number of experienced statisticians have pointed to the practical 
difficulties the “occupation and industry” item will raise. Perhaps these 
can be overcome as each State registrar gives detailed instructions to 
local officials in his State. On the other hand, a question concerning 
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“last grade of school completed" would seem to offer slight possibility 
of being misunderstood and would be easy to tabulate. 

One must also ask whether the persons required to complete the 
forms will object to certain items. Will persons applying for a marriage 
license resent being asked to state the last grade of school completed, 
or to give their present occupation and industry? In the final analysis 
these and similar questions can only be answered by actual field tests. 

In response to a request accompanying the questionnaires, a large 
number of suggestions of new items to be included on the report forms 
were received. Consideration will be given first to the marriage ques- 
tionnaire. In the consumer group, nearly one-third of the suggestions 
asked for religious preference of the persons to be married. Other 
suggestions, in order of frequency, concerned physical qualifications 
for marriage such as results of health examinations, facts concerning 
children by previous marriage, details regarding the marital records 
such as date of first marriage, economic status such as income, various 
facts concerning parents, and other items. The registrar group men- 
tioned most frequently the desirability of more facts about the parents, 
such as name and birthplace. This suggestion was followed in frequency 
by items concerned with legal status and identification, religion, the 
marital record, items useful for follow-up activity to complete the 
records, and other items. 

The suggested additional items for the divorce form had many simi- 
larities to the list for the marriage form. The consumer group stressed 
religious preference, more than one-third of the total falling here. 
Other suggestions, in order of frequency, concerned the divorce action, 
such as facts about alimony, the children affected, the marital record, 
economic status, data regarding parents, birthplace, education, and 
other items. The registrar group suggested additional facts on the 
divorce action, regarding the children, religious preference, and other 
items. 

Since the forms must be reasonably brief and easy to complete and 
tabulate, it is clear that some difficult decisions must be made regard- 
ing items to be excluded. There are important differences in the sug- 
gestions received from the consumers and the producers of these sta- 
tistics. This was to be expected. In general, the State registrars, 
having special knowledge of local officials’ problems, place greater 
emphasis on items essential to identification and on the desirability 
of keeping the forms brief. The consumers, not concerned with local 
limitations, ask for items that will make possible the detailed break- 
down of the gross figures that will make them more meaningful. 
Results of the questionnaires will also be examined by the Working 
Group on Marriage and Divorce Registration, and the best possible 
compromise will be sought between the points of view of the producers 
and the consumers. Preliminary forms will be prepared and studied 
before issuing the standard certificates. It is not assumed that every 
State will use identical forms; it is hoped that every State form will 
contain all items considered essential. Probably many States will add 
items to the standard list, or will retain the additional items now con- 
tained on their forms. 

Another major phase of the program concerns the plan to establish 
Registration Areas for marriage and divorce comparable to the long- 
familiar Birth and Death Registration Areas. The Registration Areas 
are the suggestion of the Working Group and grow out of the experi- 
ence of State registrars with improving other vital statistics. Many of 
the State registrars are very optimistic regarding the rate of growth of 
the Marriage and Divorce Registration Areas once they are established. 
It is planned to publish statistics from Registration Areas in greater 
detail than will be possible for the remaining States. Experience during 
the development of the Birth and Death Registration Areas indicates 
that the States desire to be members of a Registration Area because 
of the wide use made of the statistical reports for such areas. 

The original States that will make up the Marriage and Divorce 
Registration Areas will necessarily be limited to States with central 
files of marriage and divorce records. The accompanying table indi- 
cates which States, at this time, maintain central files of marriage and 
divorce records. 

Three fundamental questions may be raised concerning future 
developments of this program: 

First.—How shall consistency checks be made of periodic reports 
of marriages and divorces? Since a legal marriage or divorce inevitably 
requires formal registration, it may appear that there is no urgent 
need for consistency checks. Recording at the local level, of course, 
may not result in recording at State or federal levels; and consistency 
checks, with emphasis upon routine procedures and record-keeping, 
are essential. 

As a first step in this direction the National Office of Vital Statistics, 
in cooperation with the Working Group, is preparing a procedural 
manual for use by State registrars. This will set forth the necessary 
steps for registration, follow-up by letter or field agent, record 
processing, and all aspects of the work needed to produce comprehensive 
national marriage and divorce statistics. Obviously, this manual must 
represent the pooling of experience of many registration officials. 
Responsibility for carrying out periodic checks must rest with the 
State registrars since they have responsibility for the completeness 
and comprehensiveness of the vital statistics in the various States. 
In this important work the National Office of Vital Statistics will be 
glad to cooperate to the limit of its resources. In due time a body of 
useful knowledge that can be applied generally will be developed. 


CENTRAL FILES OF MARRIAGE AND DIVORCE RECORDS 

(States, Independent Registration Areas, and Territories indicating 
those maintaining Central Files of Marriage and Divorce Records) 

                      Marriage  Divorce                     Marriage  Divorce
Areas                 Records   Records   Areas             Records   Records

Alabama                  x         x      Montana              x         x
Arizona                                   Nebraska             x         x
Arkansas                 x         x      Nevada
California               x                New Hampshire        x         x
Colorado                                  New Jersey           x
Connecticut              x         x      New Mexico
Delaware                 x         x      New York             x
Dist. of Columbia                         North Carolina
Florida                  x         x      North Dakota         x         x
Georgia                  x         x      Ohio                 x         x
Idaho                    x         x      Oklahoma
Illinois                                  Oregon               x         x
Indiana                                   Pennsylvania         x         x
Iowa                     x         x      Rhode Island         x
Kansas                   x         x      South Carolina       x
Kentucky                                  South Dakota         x         x
Louisiana                x         x      Tennessee            x         x
Maine                    x         x      Texas
Maryland                 x         x      Utah                 x        (a)
Massachusetts            x         x      Vermont              x         x
Michigan                 x         x      Virginia             x         x
Minnesota                                 Washington
Mississippi              x         x      West Virginia        x
Missouri                 x         x      Wisconsin            x         x
                                          Wyoming              x         x

Independent Registration Areas            Territories

New Orleans              x         x      Alaska               x         x
New York City                             Hawaii               x         x
                                          Puerto Rico          x         x
                                          Virgin Islands

(a) Law for the central filing of divorce records recently enacted. 
Whether all of these States can be counted upon to enter the original 
Registration Areas will depend upon their ability to meet certain 
essential requirements. It will be necessary, of course, that the State 
maintain a current file of marriage and divorce records, reasonably 
complete, and that the State agree to provide statistical data to be 
used in preparing the national summaries. 

Second.—How shall national marriage and divorce statistics be 
compiled? In recent years pretabulated data supplied by the State 
registrars have been the principal source of national statistics. The 
number of States that can provide the necessary detailed tables has 
been increasing and there is every reason to believe that this trend will 
continue. Pretabulation of data by the States is the simplest method 
and the least expensive. At the same time the chief burden is thrown 
on the State offices, which in some cases may not be in a position to 
carry it. It is also much more difficult to get consistency through pre- 
tabulated data and there is a certain inflexibility in this method. It 
makes impracticable any additional study of the basic records once the 
routine tabulations have been completed as this would require that every 
State prepare new tabulations. There are no present plans to modify 
this procedure. 

There is a second method by which national statistics could be 
compiled. Microfilm copies of marriage and divorce certificates could 
be purchased from each State and the tabulations made in the National 
Office. This method is familiar to State registrars who use it for birth 
and death certificates. Punch cards are prepared in the National Office. 
Punch cards of births in Illinois are prepared by that State’s Bureau 
of Statistics, for use by the National Office. Several other States are 
considering the possibilities of such a cooperative program with the 
States furnishing punch cards of births to the National Office. 

Third.—To what extent can sampling methods be used in preparing 
national marriage and divorce statistics? One can answer this question 
only on the basis of actual field tests. It may appear that there is no 
need for sampling since every effort is being made to secure registration 
of all marriages and divorces. However, even the most comprehensive 
report forms will leave many questions unanswered. Study of the 
suggested items written on the questionnaires makes this clear. The 
decennial population censuses make increasing use of sampling, and our 
situation is analogous. Some of the most important social characteris- 
tics can only be obtained on a sampling basis. Plans have been drawn 
to use the 25,000 households of the Current Population Survey of the 
Bureau of the Census. It may be possible, also, through sampling 
methods, to provide a check on the totals obtained by registration. It 
should be emphasized that the practical difficulties to be encountered 
can be determined only by field tests. 


SAMPLING THE FEDERAL OLD-AGE AND 
SURVIVORS INSURANCE RECORDS* 


B. J. MANDEL 
Bureau of Old-Age and Survivors Insurance 


INTRODUCTION 


AT THE end of 1952, records were available for about 100 million 
accounts with some wage credits under the old-age and survivors 
insurance program since January 1937; for more than ten million 
employing organizations which had reported wages under the program; 
and for over five million individuals who were receiving either retire- 
ment or survivors benefits under the program. It is apparent from 
this quantitative picture of the vastness of these records that only a 
sound and flexible system of sampling could tap the information 
contained in them without undue expense and delay. The purpose of this 
paper is to describe the sampling systems used for tabulating statistics 
from the old-age and survivors insurance records and the associated 
methods and problems of estimation. 

Some statistics become available on a 100-per cent basis, without 
extra cost, because they are part of the controls in the accounting op- 


quarter of 1952; that some 53 million wage items were listed on these 
reports, with aggregate taxable wages amounting to $33 billion for the 
quarter; that 1.1 million employee account numbers were issued in the 
third quarter of 1952 and that 150,000 new employer identification 
numbers were assigned in that quarter. 

However, while these easily-obtained accounting totals furnish 


America" submitted to the United Nations Statistical Commission, Sub-Commission on Statistical 
Sampling, Fifth Session, Calcutta, India, December 19, 1951. The writer wishes to thank Irwin 
-Age and Survivors Insurance for critically reading the paper and making valuable comments. 
records, such as age and insurance status of workers, amount of monthly 
benefits by State of residence of the beneficiary and industrial activity 
and size of business organization. As a rule, such detailed data are not 
essential for the accounting and benefit paying processes. Not being 
an integral part of these operations, therefore, the data can be obtained 
only by independent tabulations either of the entire universes or sam- 
ples drawn from them. 


THE UNIVERSES AND NEEDED FACTS 


To appraise the effectiveness of the social security provisions in 
providing economic security, to administer various aspects of the 
program, to formulate policy and legislation, and to forecast income to 
and outgo from the insurance fund, accurate and current information is 
needed on a variety of subjects. Vitally important is information on 
the number and characteristics of persons covered and insured under 
the program, as well as on the number and characteristics of persons 
and families receiving benefits. These two types of statistics are repre- 
sented by two distinct universes, namely, employees with wage credits 
and persons in receipt of benefits. Facts gleaned from the basic files 
for these individuals could shed light on such questions as: How many 
persons are contributing under the program and what is the amount 
of their contributions? How many have worked in covered jobs suffi- 
ciently long to be insured, and how much is their average monthly 
wage or potential benefit amount? How many insured persons are 
approaching the retirement age of 65? How many families are already 
in receipt of benefits and how much are they receiving? 

Also of importance, particularly in administrative planning, is 
information on the number and characteristics of employing organizations 
reporting under the program, the third universe for which statistics 
are tabulated. These tabulations could answer such questions as: How 
many new businesses are started in a month or a quarter? How many 
are discontinued? How many are currently operating and what are 
their characteristics, such as their size, industrial activity, location, 
aggregate employment and payrolls? 

Thus, altogether, statistics are needed about the size and charac- 
teristics of three different universes commonly known as employees, 
beneficiaries, and employers, each of which contains millions of indi- 
vidual items. Tabulations from one of the three universes are currently 
being made on a 100-per cent basis, namely, the file of new employer 
numbers issued and the file of currently reporting employers. (The file 
of business deaths is sampled on a fifty-per cent basis.) In addition, a 
special annual tabulation of beneficiaries by county of residence is 
made on a 100-per cent basis. The data from these 100-per cent 
tabulations are used widely as bench marks or censuses both in the Social 
Security Administration and in other government agencies. The costs 
of these tabulations have not been large in the light of the required 
detail and accuracy. Furthermore, the requirements could not have 
been met more economically through sampling. 

The remaining two universes, namely, workers and beneficiaries 
(with the above exception) are sampled both periodically and inter- 
mittently for special studies. 


SPECIFIC SAMPLING SYSTEMS AND SIZES 


The system of identification and controls.—A nine-digit account 
number is used to identify all persons who receive wage credits under the 
program, including those on whose account benefits are paid. Employers 
making reports are also identified by a nine-digit number. In both 
cases, the number is issued generally at the time of first coverage under 
the program, and it serves as a basis for identifying the employer so 
long as he remains in business and the employee throughout his working 
lifetime, including the period of benefit payments to him or his 
dependents. The following is the composition of these numbers: 


Employee Account Number 

    000  00  0000 

    000   Three digits representing the geographical area of issuance. 
          There are 612 area numbers in use at present. 
    00    Two digits representing a group or sequence of numbers. One 
          hundred groups of numbers (00 to 99) can be issued in any area. 
    0000  Four digits representing the serial number. 10,000 numbers can 
          be issued for each group in any one area. Therefore, one 
          million numbers can be issued in a single area. 


Employer Account Number 

    00  0000000 

    00       Two digits representing area of issuance. There are 68 such 
             areas designated as Collector of Internal Revenue districts. 
    0000000  Seven digits representing the serial. Therefore, 10 million 
             numbers can be issued in any one district. 
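
The composition of the employee account number can be sketched in a few lines of Python. This is only an illustration of the layout described above; the names `EmployeeAccount` and `parse_employee_account` are assumptions made for the example and do not come from the Bureau's procedures.

```python
from typing import NamedTuple


class EmployeeAccount(NamedTuple):
    area: str    # three digits: geographical area of issuance
    group: str   # two digits: group (sequence) of numbers within the area
    serial: str  # four digits: serial number within the group


def parse_employee_account(number: str) -> EmployeeAccount:
    """Split a nine-digit employee account number into area, group, serial."""
    digits = "".join(ch for ch in number if ch.isdigit())
    if len(digits) != 9:
        raise ValueError("an employee account number has nine digits")
    return EmployeeAccount(digits[:3], digits[3:5], digits[5:])


# Capacity arithmetic stated in the text: 100 groups of 10,000 serials
# give one million numbers per area.
GROUPS_PER_AREA = 100
SERIALS_PER_GROUP = 10_000
NUMBERS_PER_AREA = GROUPS_PER_AREA * SERIALS_PER_GROUP
```

For instance, `parse_employee_account("318-12-2505")` separates an (invented) number into area `318`, group `12` and serial `2505`; it is the serial that the sampling systems described below rely on.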


When the Social Security Administration adopted a numbering system 
to identify employees and employers, it also set up procedures 
for the issuance of employee numbers with predetermined three digits, 
and two digits in the case of employers, in specific areas. Thus, for 
example, all social security account numbers with 318 through 361 as 
the first three digits were issued to individuals in the State of Illinois 
and all employer identification numbers with 52 as the first two digits 
were issued to employers in Maryland. It also set up procedures for 
avoiding the issuance of more than one number to an individual or 
employer, by a screening process, and for issuing numbers in strict 
numerical sequence [1]. 

1 For a fuller description, see “Technical Problems Involved in the Administration of Social 
Security Schemes—Methods of Identification of Insured Persons and Organization of 
ph Number I, pages 209-216, published by the United Nations. 

The more than 500 field offices of the Bureau of Old-Age and Survivors 
Insurance throughout the country, including Alaska, Hawaii, Puerto 
Rico and the Virgin Islands, issue account numbers to individuals as they 
apply for them. Controls over the specific numbers to be issued and the 
methods of issuance are set up and maintained by the Bureau’s central 
office in Baltimore. At present the account numbers are released to the 
field offices in multiples of 500 of which, for reasons explained later, 
20 per cent contain either a “2” or “7” in the first place of the serial. 
Thus, if a field office is assigned 500 numbers to issue, 100 numbers are 
in the “2” or “7” series; if an office is assigned 5,000 numbers, 1,000 
numbers contain either a “2” or “7” in the first place of the serial. The 
field offices must issue numbers consecutively, starting with the lowest 
number of the series assigned to them. Prior to October 1940, blocks of 
numbers of unspecified sizes arranged in numeric sequence were 
released to field offices. Furthermore, some of the large employing 
organizations were given whole clusters of numbers to issue directly to their 
employees, so as to relieve the initial registration workload on the 
administration. Because clusters of consecutive numbers were issued 
to groups of people in the same employing organization, some serial 
correlation was introduced. However, it is logical to assume that the 
effects of this serial correlation have been substantially reduced over 
the past fifteen years, because of interemployer and interindustry shifts 
of employees, deaths and retirements [2]. Of course, serial correlation 
is introduced even without employer issues, because of the issuance of 
numbers in numerical sequence. This correlation also diminishes with 
time. 

Method of sampling and sample sizes.—The system of sampling the 
social security records for employees and beneficiaries is based on the 
last four digits (the serial) in the account number and is geared to the 
administrative operations, so as to derive statistics wherever possible 
as a by-product of the accounting work. Consequently, sample selection 
for obtaining data on employees is generally restricted to a sub-universe 
of 20 per cent of the accounts to which wages are posted for a full 
calendar year at one time. The Bureau of Old-Age and Survivors 
Insurance split up its entire file of accounts into four parts, so that the 
job of posting wages to the employee's credit could be spread out uni- 
formly over the twelve months of the year, instead of accumulating a 
peak posting load once each year and experiencing slacks during 
other parts of the year. The posting group which could provide 
calendar-year data as a by-product of the accounting operations was 
chosen as the sub-universe for sampling. The remaining three groups 
included wages for four calendar quarters which spanned two 
consecutive calendar years. Since most economic analyses are based on a 
calendar year rather than on any other twelve-month period, this group 
was preferred over the others that were available. 

By virtue of previous planning, this sub-universe consists of all 
accounts having “2” or “7” as the first digit of the serial, or a 20-per 
cent sample. Selection was made on the basis of the first digit in the 
serial to economize on the sorting operations. Since all accounts are 
filed in numerical sequence for accounting purposes, blocks of 1,000 
accounts could be withdrawn at a single time for statistical tabulations 
without additional sorting, by choosing a digit in the first place of the 
serial. It should be noted that this procedure yields a sub-universe 
for sampling which is composed of clusters of 1,000 numbers each, 
which, under current procedures, are made up of smaller clusters of 
100, 200, 300 and so forth up to a maximum of 1,000 numbers issued 
consecutively. At this time, the sub-universe of accounts with wage 
credits includes over twenty million accounts and is too large for 
tabulation of data on employees. Consequently, samples of different 
sizes are drawn from it to provide data of different kinds. 
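
The selection of the 20-per cent sub-universe can be illustrated with a minimal sketch. It assumes serials are written as four-character strings from `0000` to `9999`; the helper names are invented for the example. It shows why choosing on the first digit of the serial yields whole blocks of 1,000 consecutive numbers that can be withdrawn from a numerically ordered file without sorting.

```python
def in_sub_universe(serial: str) -> bool:
    """True if the four-digit serial has 2 or 7 in its first place."""
    return serial[0] in ("2", "7")


def contiguous_blocks(numbers):
    """Collapse a sorted list of integers into (first, last) runs."""
    blocks = []
    for n in numbers:
        if blocks and n == blocks[-1][1] + 1:
            blocks[-1] = (blocks[-1][0], n)
        else:
            blocks.append((n, n))
    return blocks


# Enumerate every possible serial within one area-and-group combination.
serials = [f"{n:04d}" for n in range(10_000)]
selected = [int(s) for s in serials if in_sub_universe(s)]

fraction = len(selected) / len(serials)   # share of serials selected
blocks = contiguous_blocks(selected)      # runs of consecutive serials
```

Here `fraction` comes to 0.2 and `blocks` to two runs, 2000 through 2999 and 7000 through 7999, which are the clusters of 1,000 numbers mentioned above.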

The largest sample for employee data is known as the One-Per Cent 
Continuous Work History Sample [3], presently comprising about one 
million accounts nationally. This sample provides information on the 
wage and employment characteristics and other classifications of the 
100 million accounts over the working life of the individual beginning 
with 1937. Another sample, the One-Per Cent Annual Employee 
Sample, which is part of the Continuous Work History Sample, con- 
tains about 500,000 to 600,000 accounts nationally and furnishes 
information on the industrial distributions, earnings, age and other 
wage and employment characteristics of the workers in the latest year. 
A third sample, which is a sub-sample of the work history sample 
known as the 0.1-Per Cent Advance Sample, includes about 
100,000 accounts nationally and provides selected quarterly, annual 
as well as work history data on employees under the program needed 
for current and long-range estimating. On some special occasions use 
is made of a national 0.02-per cent sample comprising about 20,000 
accounts.² 

All of these varying-sized samples were selected from the 20-per 
cent sub-universe, and smaller-sized samples were selected as internal 
segments of the larger-sized samples, so as to economize as much as 
possible on sample selection and maintenance. Thus, a four-per cent 
sample used some ten years ago for tabulating 1937-40 work history 
data was a 20-per cent sample of the 20-per cent sub-universe, 
comprising accounts with the digits “0” or “5” in the last column of the serial. 
This method included two account numbers out of every 10 in the sub- 
universe, and provided a systematic 20 per cent sample from the sub- 
universe of 20 per cent, or a four-per cent sample. 

Only the first and last digits of the serial number were relied on to 
get 20-per cent and four-per cent samples. However, for the three-per 
cent samples used for tabulating 1941-44 annual employee data, and 
the one-per cent, 0.1-per cent and 0.02-per cent samples used currently, 
selection was on the basis of internal as well as external digits of the 
serial. The three-per cent sample was obtained by splitting the four-per 
cent sample into two segments of one per cent and three per cent. The 
one-per cent segment was composed of accounts with a “2” or “7” in 
the first place of the serial and 05, 20, 45, 70, or 95 as the last two digits 
of the serial. Since the eighth and ninth place of the account number 
for persons in the four-per cent sample contained 20 possible numbers, 
(namely, 00, 10...90 and 05, 15... 95), selection of five of them 
provided a fourth of the four-per cent sample, or one per cent. The 
three-per cent sample, of course, was the residual segment after the 
one-per cent was sorted out of the four-per cent sample. 
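
The arithmetic of the four-per cent sample and its split into one-per cent and three-per cent segments can be checked by enumeration. The sketch below assumes four-character serial strings and uses invented function names; it simply applies the digit rules stated above to all 10,000 possible serials.

```python
# Last-two-digit endings that define the one-per cent segment (from the text).
ONE_PER_CENT_ENDINGS = {"05", "20", "45", "70", "95"}


def in_sub_universe(serial: str) -> bool:
    """2 or 7 in the first place of the serial (20 per cent)."""
    return serial[0] in ("2", "7")


def in_four_per_cent(serial: str) -> bool:
    """Sub-universe accounts with 0 or 5 in the last place of the serial."""
    return in_sub_universe(serial) and serial[3] in ("0", "5")


def in_one_per_cent(serial: str) -> bool:
    """Sub-universe accounts whose last two digits are one of five endings."""
    return in_sub_universe(serial) and serial[2:] in ONE_PER_CENT_ENDINGS


serials = [f"{n:04d}" for n in range(10_000)]
four = [s for s in serials if in_four_per_cent(s)]
one = [s for s in four if in_one_per_cent(s)]
three = [s for s in four if not in_one_per_cent(s)]   # residual segment

# The 20 possible last-two-digit endings within the four-per cent sample.
endings_in_four = sorted({s[2:] for s in four})
```

Out of 10,000 serials this gives 400 in the four-per cent sample, of which 100 fall in the one-per cent segment and 300 in the residual three-per cent segment.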

To obtain the 0.02-per cent sample, the first step was to select from 
the aforementioned five groups in the one-per cent sample, the group 
that contained the digits 05 in the last two places. One-fifth of the 
one-per cent sample, or a 0.2-per cent sample, was thus obtained. Selecting 
from this segment only the accounts with the digit five in the seventh 
place of the account number (or second place in the serial), yielded 
one-tenth of the 0.2-per cent segment, or a sample of 0.02-per cent. This is 
a systematic sample, since it consists of every five-thousandth number 
in the account number population and is selected proportionately from 
each area. 

2 While the percentage sample size of all these samples is small, the samples are actually large in 
terms of absolute size. These large samples are justified by the multi-purposes they serve and the great 
variety of classifications and cross-classifications for which data are needed. For example, in order to 
estimate the number of old-age retirement claims in each field office, data are needed on the number of 
workers between the ages 60-64 who are insured under the program. While this number was 35,000 
in the one per cent sample as of January 1950, for the entire country, the average per field office was 
about 70 workers. As another example, data were needed for a cancer research project on the number of 
workers in the rubber industry in Ohio in 1949, by age and sex. The one-per cent sample contained a total 
of 900 workers in this study. Experience has shown that it is less expensive to maintain a sample on a 
quirements. However, the need for continued maintenance of the large sized samples is regularly 
re-examined and further research on the optimum sample sizes is continuing in the Bureau. 

Recently, the 0.02-per cent sample was increased to a 0.1-per cent 
sample. This was accomplished by adding to the 0.02-per cent sample, 
which already contained the digits 2505 and 7505, all the additional 
numbers which had a “5” in the second place of the serial out of the 
foregoing one-per cent sample, namely, the digits 2520, 2545, 2570, 
2595, 7520, 7570, 7545, and 7595. 
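
The serials making up the 0.1-per cent and 0.02-per cent samples can be listed directly from the digit rules. The following is a sketch under the same assumptions as before (four-character serial strings, illustrative function names).

```python
# Last-two-digit endings that define the one-per cent sample (from the text).
ONE_PER_CENT_ENDINGS = {"05", "20", "45", "70", "95"}


def in_one_per_cent(serial: str) -> bool:
    """2 or 7 in first place, and one of five endings in last two places."""
    return serial[0] in ("2", "7") and serial[2:] in ONE_PER_CENT_ENDINGS


def in_tenth_per_cent(serial: str) -> bool:
    """One-per cent accounts with 5 in the second place of the serial."""
    return in_one_per_cent(serial) and serial[1] == "5"


def in_fiftieth_per_cent(serial: str) -> bool:
    """0.02 per cent: second-place digit 5 and last two digits 05."""
    return in_tenth_per_cent(serial) and serial[2:] == "05"


serials = [f"{n:04d}" for n in range(10_000)]
tenth = [s for s in serials if in_tenth_per_cent(s)]
fiftieth = [s for s in serials if in_fiftieth_per_cent(s)]
```

The enumeration yields ten serials for the 0.1-per cent sample (2505, 2520, 2545, 2570, 2595 and the corresponding 7-series) and two, 2505 and 7505, for the 0.02-per cent sample, that is, one serial in every 5,000.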

The chart on page 469 shows the specific digits in the serial number 
which are used for selecting the various sized samples of employees 
and beneficiaries. 

It is, of course, not necessary to use precisely this same combination 
of numbers in the serial to obtain systematic samples of the specified 
sizes. However, in deciding on the specific digits, considerations of 
economy in sorting have dictated the choice. With the file already in 
numerical order, the first digit in the serial was used to obtain the sub- 
universe in order to avoid extra sorting. Furthermore, where sorting 
costs are equal, digits are selected so as to give as wide a dispersion as 
possible, in order to reduce the effects of serial correlation. 

The 20-per cent sample from which are tabulated most of the data 
on beneficiaries under the program consists of the accounts that fall 
in the sub-universe of 20-per cent. Thus, all accounts with a “2” or 
“7” in the first place of the serial on which either old-age or survivors 
benefits are paid are included in the sample.³ This sample may be 
described as a systematic sample of clusters in order by area and date 
of issuance of the account number. There are two other samples which 
are digitally selected in the same way, namely, the 20-per cent and 
one-per cent samples of persons who receive account numbers each quarter. 


SPECIFIC SERIAL NUMBER DIGITS USED TO SELECT 
STATISTICAL SAMPLES OF EMPLOYEES AND 
BENEFICIARIES UNDER THE FEDERAL 
OLD AGE AND SURVIVORS PROGRAM 

Size of sample     Digits of the serial used for selection
Universe           All digits in the serial
20 per cent        Digits 2 or 7 in first place
4 per cent         Digits 0 or 5 in last place
1 per cent         Digits 20, 70, 05, 45, or 95 in last two places
0.1 per cent       Digits 520, 570, 505, 545, or 595 in last three places
0.02 per cent      Digit 505 in last three places

3 Plans have 


METHODS OF ESTIMATION AND MEASURING ERROR 


Estimating totals.—Both sampling and other types of variations need 
to be taken into account in preparing estimates from the sample 
tabulations. Furthermore, the methods of estimation differ, depending 
upon the uses of the data, costs and availability of control totals. 
Because the old-age and survivors insurance digital samples are of 
uniform size throughout all classifications, methods of estimation 
from the sample data, for unrefined uses of the data (e.g. in 
determining general relationships and magnitudes), are usually simple and 
straightforward, namely, by use of the reciprocal of the sampling ratio. 
Thus, the 20-per cent sample of newly issued accounts can be inflated 
to the universe total by a multiplier of five. The one-per cent sample of 
workers with wage credits can be similarly inflated by adding two 
zeros to the sample figures (both of workers and wages). Likewise, the 
0.02-per cent and the 0.1-per cent sample data can be multiplied by 
5,000 and 1,000 respectively, to yield estimated totals. Since the old- 
age and survivors insurance samples are self-weighting, computation of 
derivative measures, such as averages, percentages, ratios, coefficients 
of variation and other statistical and analytical measures is made 
directly from the sample distributions. 
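
Inflation by the reciprocal of the sampling ratio amounts to a single multiplication. A minimal sketch follows; the dictionary keys and function names are assumptions for the example, and exact fractions are used only to keep the multipliers whole.

```python
from fractions import Fraction

# Sampling ratios of the digital samples named in the text.
SAMPLING_RATIOS = {
    "20 per cent": Fraction(20, 100),
    "1 per cent": Fraction(1, 100),
    "0.1 per cent": Fraction(1, 1000),
    "0.02 per cent": Fraction(2, 10000),
}


def inflation_factor(sample: str) -> int:
    """Reciprocal of the sampling ratio, e.g. 5 for the 20-per cent sample."""
    factor = 1 / SAMPLING_RATIOS[sample]
    if factor.denominator != 1:
        raise ValueError("sampling ratio does not give a whole multiplier")
    return int(factor)


def inflate(sample_figure, sample: str):
    """Estimate a universe total from a self-weighting sample figure."""
    return sample_figure * inflation_factor(sample)
```

Applied to the figure quoted in footnote 2, `inflate(35_000, "1 per cent")` gives 3,500,000, the simple inflated estimate corresponding to 35,000 sample accounts.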

While inflation of the sample data for the many thousands of cells 
to obtain statistics on approximate magnitudes is accomplished by 
simply using the appropriate sampling ratio, as indicated above, devi- 
ations from this method are relied on when greater precision is needed 
in estimates for selected cells. One deviation from the use of the above 
sampling-ratio method occurs when an actual universe total is 
available as a by-product of the accounting or claims control operations. 
In such instances, estimates are prepared not by the probability ratio 
but instead by use of ratio estimates. 

One important universe control figure, derived from accounting data 
is the total amount of taxable wages paid in each calendar year. For 
selected estimates, this control figure is divided by the figure on total 
taxable wages in the sample, and the resulting ratio is multiplied by 

. the sample figures on wages. Thus, a ratio estimate is used in inflating 
sample data on wages of workers in selected classifications. It would, 
of course, be desirable to use this method for inflating all sample cells, 
dug would be too expensive to do for the thousands of cells tabu- 
ated. 
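
The ratio estimate described here can be written as a short function. This is a sketch only; the argument names are assumptions, and the figures in the usage note are invented to show how the result can differ from simple inflation.

```python
def ratio_estimate(cell_sample_wages, total_sample_wages, control_total_wages):
    """Inflate a sample cell by the ratio of the universe control total
    to the corresponding total in the sample."""
    if total_sample_wages <= 0:
        raise ValueError("sample total must be positive")
    # Multiply first, then divide, so whole-number inputs stay exact.
    return cell_sample_wages * control_total_wages / total_sample_wages


def simple_inflation(cell_sample_wages, multiplier):
    """Inflate a sample cell by the reciprocal of the sampling ratio."""
    return cell_sample_wages * multiplier
```

With invented figures, suppose the accounting control total of taxable wages is 100,000 units while the one-per cent sample shows 980 units in all and 245 units in one classification. Then `ratio_estimate(245, 980, 100_000)` returns 25,000, whereas `simple_inflation(245, 100)` returns 24,500; the ratio estimate corrects for the sample total falling short of one per cent of the control figure.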

In dealing with the 20-per cent sample of beneficiaries and previously 
employed persons represented in benefit awards, where the number of 
cells for which data are published is smaller than in the case of the 
employee samples, universe totals for each type of benefit (six types, 
such as retired worker, aged wife, widow, etc.), obtained from the 
claims control records, are used. Therefore, the 20-per cent sample 
data are inflated to a 100-per cent basis by means of several different 


universe control figures which provide ratios for the different type-of- 
benefit groups. 


In the case of the samples of employees, the estimates for selected 
cells are frequently adjusted for the exclusion from the sample tabula- 
tion of late-processed or delinquently-filed wage reports; in other 
words, the reports which are not in file at the time of sample selection 
and tabulation. These adjustments are generally made after supple- 
mentary data are obtained later from special sample tabulations of 
information filed in late reports. In some instances, the estimates for 
selected classifications are adjusted on the basis of tabulations of de- 
linquently-filed or late-processed reports for previous years, by as- 
suming the same proportionate adjustment as for previous years 
applies to the current-year data. Thus, in making estimates from the 
social security samples of workers and beneficiaries most of the prob- 
lems of measuring the bias of late-response are manageable as a result 
of the tabulation and analysis of late-filed tax returns.4 

One additional factor for which correction is necessary in the 
estimates of employees is the issuance of multiple accounts. As previ- 
ously mentioned, the sample to provide data on employees is drawn 
from the universe of social security accounts. In estimating the number 
of workers it is not known precisely to what extent the sample of 
accounts with wage credits differs from one on individual workers, 
because of the existence of an unknown number of multiple accounts. 
Despite all efforts to avoid issuing more than one number to an indi- 
vidual, it is known that an undetermined number of persons, for one 
reason or another, have more than one number. That fact alone, of 
course, would be no cause for concern in using the worker data, were 
it not for the fact that some of these persons have wages credited to 
more than one account, and, therefore, they have a greater chance of 
being included in the old-age and survivors insurance samples. Infla- 
tion by use of the reciprocal of the sampling ratio causes some over- 
statement in the estimated number of workers. In addition, there is 
understatement in the person's wage credits, because in some cases 
an individual’s wages are credited to two or more accounts which are 
not combined in the sample. Many of these multiple accounts are 
discovered as part of the regular accounting operations, thus forming a 
basis for measuring part of the bias [4] and partly adjusting the esti- 
mates. A special study is currently under way to develop more infor- 
mation about the bias in the estimates of workers due to the factor 
of multiples. 
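
The direction of the two biases just described can be seen from a small expected-value calculation. This is a sketch under simplifying assumptions that are not in the original: every affected person has exactly two accounts with wage credits, and all counts and wage totals are hypothetical.

```python
def expected_inflated_worker_count(single_account_workers, double_account_workers):
    """Expected value of (sample account count) x (reciprocal of sampling ratio).

    Every account with wage credits has the same chance of selection, so the
    inflated count estimates the number of accounts, not the number of workers.
    """
    return single_account_workers + 2 * double_account_workers


# Hypothetical universe, for illustration only.
single_account_workers = 995_000
double_account_workers = 5_000  # each has wages posted to two accounts
true_workers = single_account_workers + double_account_workers

estimated_workers = expected_inflated_worker_count(
    single_account_workers, double_account_workers
)
overstatement = (estimated_workers - true_workers) / true_workers

# The same wage total is spread over more "workers" (accounts) than exist,
# so the average wage credit per worker is understated.
total_wages = 2_500_000_000  # hypothetical
true_average = total_wages / true_workers
estimated_average = total_wages / estimated_workers

print(estimated_workers, overstatement, true_average, estimated_average)
```

Under these assumptions the overstatement in the number of workers equals the share of workers holding two active accounts.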

4 There are also some relatively minor problems of non-response and incomplete response, generally 
representing less than one per cent of the employment and earnings totals. However, [...] 
are made with employers by the Bureau as part of the regular administrative operations and [...] 
are reflected automatically in the samples. Because the reports under the program are required by law, 
these problems are believed to be relatively insignificant. 

Measuring sampling variability.—From the foregoing, it is obvious 
that the OASI samples are not unrestricted random samples. They are 
usually two-stage samples. The first stage is the systematic selection 
of a sub-universe consisting of clusters of 1,000 items each from the 
universe of social security accounts. As previously indicated, this sub- 
universe is stratified by area. The next stage is the selection of digital 
systematic sub-samples from the sub-universe. These sub-samples 
inherit the area stratification of the sub-universe. To the extent that 
such stratification operates to increase the variability within the 
samples it tends to reduce the sampling error associated with unre- 
stricted random samples, as demonstrated by Lillian Madow [5]. On 
the other hand, the selection of clusters of items for the sub-universe 
operates to increase the sampling error. It is not known definitely 
whether the sampling variance from this type of sample is greater or 
smaller than that of unrestricted random samples [6]. Nevertheless, 
in the absence of data to measure sampling variance by more acceptable 
methods, use has been made of the variances which would result from 
unrestricted random samples as an approximation of sampling error 
in the data. To improve on this method plans are being made for meas- 
uring sampling variance by the use of several sub-samples of equal size 
to be taken from the one-per cent Continuous Work History Sample 
[7]. Results from this study, however, are not expected to become 
available for about a year. 

Sampling variability has been studied in a qualitative way by com- 
paring data derived from different-sized samples. It is believed on the 
basis of the comparisons between the data in the various samples that 
sampling error in the OASI data is not far from that which would be 
expected in random samples. 
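
The two approaches to sampling error mentioned above may be sketched as follows. The first function is the unrestricted-random-sample (binomial) approximation for a proportion, applied here to the share of male workers aged 20-24 in the one-tenth-per cent sample tabulated later in this paper (4,095 of 32,282 with age reported). The second estimates the standard error from several equal-sized sub-samples; the five sub-sample values are hypothetical, and the function is only an illustration of the general idea, not the procedure of reference [7].

```python
import math
import statistics


def srs_standard_error(proportion, sample_size):
    """Standard error of a proportion under unrestricted random sampling."""
    return math.sqrt(proportion * (1.0 - proportion) / sample_size)


def replicated_standard_error(subsample_estimates):
    """Standard error of the mean of k equal-sized sub-sample estimates."""
    k = len(subsample_estimates)
    return math.sqrt(statistics.variance(subsample_estimates) / k)


# Binomial approximation, ignoring the finite population correction.
p_20_24 = 4095 / 32282
se_binomial = srs_standard_error(p_20_24, 32282)

# Hypothetical estimates of the same proportion from five sub-samples.
subsample_estimates = [0.126, 0.128, 0.127, 0.125, 0.129]
se_replicated = replicated_standard_error(subsample_estimates)

print(se_binomial, se_replicated)
```

The replicated estimate makes no assumption about the sample design beyond the comparability of the sub-samples, which is its attraction for systematic samples of this type.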


Detecting non-sampling errors.—While studies to measure sampling 
variability in quantitative terms are primarily in the planning stages, 
studies of non-sampling error (such as coding, punching or tabulating 
errors) have progressed further. There are several methods in use. 
From time to time comparable data become available from different 
samples, thus making possible comparison of estimates or distributions 
derived from these samples and the computation of measures of sig- 
nificance (such as chi-square tests). These tests of significance are 
made despite the fact that they are not strictly applicable to the 
OASI type of systematic samples, on the assumption that at a 99-per 
cent confidence level they would generally aid in detecting probable 
non-sampling errors. The following table illustrates this type of com- 
parison of data derived from the one-per cent and one-tenth-per cent 
samples. 


It is apparent by inspection of the percentages that a very high 
correspondence exists between the age distribution of employees for 
the same year as tabulated from the two samples. Statistical tests of 
differences did not reject the hypothesis (at a 99-per cent confidence 
level) that they came from the same universe. 


MALE WORKERS BY AGE, 1950, ONE-TENTH-PER CENT AND ONE-PER CENT SAMPLES 

                        Number                Percentage distribution 
Age             0.1-per cent   1-per cent    0.1-per cent   1-per cent 
                   sample        sample         sample        sample 

Total              32,312       323,526          100.0         100.0 
Under 15               58           765            0.2           0.2 
15-19               2,355        24,027            7.3           7.4 
20-24               4,095        41,788           12.7          12.9 
25-29               4,486        44,577           13.9          13.8 
30-34               4,138        41,030           12.8          12.7 
35-39               3,806        38,592           11.8          11.9 
40-44               3,474        33,919           10.8          10.5 
45-49               2,786        28,056            8.6           8.7 
50-54               2,388        23,961            7.4           7.4 
55-59               1,924        19,345            6.0           6.0 
60-64               1,500        14,665            4.6           4.5 
65-69                 813         7,988            2.5           2.5 
70-74                 337         3,237            1.0           1.0 
75 and over           122         1,255            0.4           0.4 
Unreported             30           371              -             - 
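
A comparison of the kind described can be sketched as a chi-square test of homogeneity on the two age distributions. Several points are assumptions and not the authors' stated procedure: the two samples are treated as independent unrestricted random samples, the unreported cases are excluded, the 35-39 count for the smaller sample is taken as 3,806 (the value consistent with the printed total), and the 1-per cent critical value for 13 degrees of freedom (27.688) is taken from standard tables.

```python
# Counts by age group, "Under 15" through "75 and over", age reported only.
tenth_per_cent = [58, 2355, 4095, 4486, 4138, 3806, 3474,
                  2786, 2388, 1924, 1500, 813, 337, 122]
one_per_cent = [765, 24027, 41788, 44577, 41030, 38592, 33919,
                28056, 23961, 19345, 14665, 7988, 3237, 1255]


def chi_square_homogeneity(counts_a, counts_b):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table."""
    n_a, n_b = sum(counts_a), sum(counts_b)
    n = n_a + n_b
    statistic = 0.0
    for a, b in zip(counts_a, counts_b):
        pooled_share = (a + b) / n
        for observed, size in ((a, n_a), (b, n_b)):
            expected = size * pooled_share
            statistic += (observed - expected) ** 2 / expected
    return statistic, len(counts_a) - 1


CRITICAL_1_PER_CENT_DF_13 = 27.688  # upper 1-per cent point, from tables

statistic, degrees_of_freedom = chi_square_homogeneity(
    tenth_per_cent, one_per_cent
)
rejected = statistic > CRITICAL_1_PER_CENT_DF_13
print(round(statistic, 2), degrees_of_freedom, rejected)
```

Under these assumptions the statistic falls well below the critical value, which agrees with the statement that the hypothesis of a common universe was not rejected.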


One other type of study to detect non-sampling errors is made in 
analyzing statistics on newly issued accounts to determine deviations 
from the prescribed procedures for issuing numbers. As indicated 
earlier, present procedures call for the issuance of 100 numbers contain- 
ing the digits two or seven in the first place of the serial out of every 
block of 500 numbers assigned to a field office, so as to reduce intra- 
class correlation within the clusters of 1,000 in the sub-universe of 20- 
per cent used for sub-sampling [1]. Misapplication of this procedure is 
determined from a comparison of the total accounts issued in each 
State with the number of accounts containing the digit “2” or “7” in 
the first place of the serial. The expected size of sample is compared 
with the actual size and deviations which exceed either the maximum 
or minimum tolerance are investigated, and remedial action is taken 
[2]. 
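
The check on the issuance procedure might be sketched as below. The expected share of one-fifth follows from the rule of 100 sample-digit numbers in every block of 500. The tolerance limits actually used by the Bureau are not given in the text; the three-standard-deviation binomial limits here are purely an assumption for illustration, as are the State totals.

```python
import math

SAMPLE_DIGIT_SHARE = 100 / 500  # numbers with "2" or "7" leading the serial


def check_issuance(total_issued, with_sample_digit, k_sigma=3.0):
    """Compare the actual sample size with the size expected from the rule.

    The k-sigma binomial tolerance is an assumed rule, not the Bureau's.
    """
    expected = total_issued * SAMPLE_DIGIT_SHARE
    spread = math.sqrt(
        total_issued * SAMPLE_DIGIT_SHARE * (1.0 - SAMPLE_DIGIT_SHARE)
    )
    low = expected - k_sigma * spread
    high = expected + k_sigma * spread
    return {
        "expected": expected,
        "low": low,
        "high": high,
        "investigate": not (low <= with_sample_digit <= high),
    }


# Hypothetical State totals.
state_a = check_issuance(total_issued=50_000, with_sample_digit=10_400)
state_b = check_issuance(total_issued=50_000, with_sample_digit=10_050)
print(state_a["investigate"], state_b["investigate"])
```

Because numbers are issued systematically rather than at random, a binomial tolerance is probably wider than necessary; a tighter limit could be substituted without changing the structure of the check.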

ADVANTAGES OF THE OASI TYPE OF SAMPLE 

Simplicity.—The main advantages of the type of digital sampling 
used are apparent from the foregoing presentation. Foremost is the 
fact that this type of sample is simple to select and understand. In 
view of the fact that the clerical staff which usually performs the 
machine operations of sample selection is not trained in sampling, a 
simple, straight-forward procedure of selection avoids complications 
and trouble spots. It is also a simple process to inflate the sample data 
to the universe because the sample is self-weighting. 

Precision and accuracy.—A second good feature of the sampling sys- 
tem is that it can take advantage of the method of issuing numbers to 
yield area and time stratification and, therefore, results in better pre- 
cision than if such stratifications were not attained. Furthermore, con- 
trol lists are available showing the account numbers which have been 
issued. By checking the sample account numbers against these lists, 
missing account numbers can be added and incorrect ones removed. 
This leads to greater accuracy because it eliminates mechanical or 
clerical errors both of omission and commission. 

Flexibility.—The digital sample serves many purposes because of its 
flexibility. For example, this type of sample is most appropriate for 
tabulating data on the work history and wage patterns of contributors 
under the program—a type of data essential for a continuing evalua- 
tion of the operations of the program. In compiling such data, it is 
necessary to have a sample which can be easily maintained by adding 
to it each year a sample of the new workers and by identifying deceased 
and retired persons. Replenishment of the sample by new workers is 
automatically accomplished by including a sample of the persons who 
currently receive new account numbers having the predetermined 
sample digits. Thus, for example, a person who obtains a new number 
having as its serial 2505 or 7505 is automatically added to the 0.02- 
per cent sample. Furthermore, by maintaining a large general-purpose 
sample the digital system readily yields smaller samples, as the need 
arises for specific purposes, by mechanical sorting on selected digits 
in the account number. Finally, the system affords a simple method of 
enriching the informational items about a given person covered under 
the social security program by adding information easily derived from 
the records maintained by other programs using the same identifica- 
tion system. For example, because the Railroad Retirement Board 
uses the same nine-digit account number for identifying workers, 
[...] for identical [...]. 
This leads to more complete statistical series because it provides fuller 
information for persons with wage credits under both systems. Simi- 
larly, the State unemployment insurance programs rely on the nine- 
digit social security number to identify workers, and it is relatively 
simple to coordinate the statistical samples under the two programs. 
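
Selection by digits of the account number can be sketched as follows. Only the two rules stated in this paper are coded: a "2" or "7" in the first place of the serial for the 20-per cent sub-universe, and serials 2505 and 7505 for the 0.02-per cent sample. It is assumed here that the serial is the last four of the nine digits; the account numbers shown are invented, and the function names are hypothetical.

```python
SUB_UNIVERSE_FIRST_DIGITS = {"2", "7"}
SMALL_SAMPLE_SERIALS = {"2505", "7505"}


def serial_of(account_number):
    """Return the four-digit serial, assumed to be the last four digits."""
    digits = [ch for ch in account_number if ch.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected a nine-digit account number")
    return "".join(digits[-4:])


def in_sub_universe(account_number):
    """True if the serial begins with one of the predetermined digits."""
    return serial_of(account_number)[0] in SUB_UNIVERSE_FIRST_DIGITS


def in_small_sample(account_number):
    """True if the serial is one of the two predetermined serials."""
    return serial_of(account_number) in SMALL_SAMPLE_SERIALS


# Shares implied by the rules, counted over all four-digit serial patterns.
all_serials = [f"{n:04d}" for n in range(10_000)]
sub_universe_share = sum(
    s[0] in SUB_UNIVERSE_FIRST_DIGITS for s in all_serials
) / len(all_serials)
small_sample_share = sum(
    s in SMALL_SAMPLE_SERIALS for s in all_serials
) / len(all_serials)

print(sub_universe_share, small_sample_share)
print(in_small_sample("123-45-2505"), in_sub_universe("123-45-3505"))
```

Because both small-sample serials begin with 2 or 7, the smaller sample is contained in the sub-universe, and a newly issued number enters or stays out of every sample automatically.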

Economy.—Economy is also a major advantage of the sampling 
system. This is especially true if sample selection and tabulation is 
integrated with appropriate accounting and administrative operations. 
Economy in the compilation of data is also achieved because of the 
aforementioned statistical controls and lists which forestall in the 
early stages unnecessary review of inconsistencies which would show 
up later. 


CONCLUSIONS 


The OASI system of sampling for statistical tabulations is based on 
an area stratified sub-universe of systematically selected clusters with 
systematic sub-sampling. Little research has been done on this type 
of sample design, and further study is necessary. The following are 
among the more important areas of research: 1) Estimating more pre- 
cisely than by the use of binomial tables the sampling error for the 
OASI type of sample. The procedure to be used [7] is expected to pro- 
duce more accurate estimates of sampling variation. 2) Reducing the 
size of the various samples so as to economize still further in the 
statistical program, without losing essential data and sources for special 
tabulations. 3) Measuring the bias that is introduced into the employee 
(but not beneficiary) data when a sample of accounts to which wage 
credits have been posted is used to determine the number and charac- 
teristics of individual workers. The presence of multiple account 
numbers creates this problem which has been discussed earlier. 


STATISTICS IN CHEMICAL EXPERIMENTATION* 


C. DANIEL 
New York City 


I. NEED FOR A MANUAL OF STATISTICS FOR CHEMISTS 


EIGHTY-SEVEN per cent of the 140 respondents to a recent question- 
naire (sent by the School of Chemical and Metallurgical Engi- 
neering at Cornell, to its graduates) indicated that a course in applied 
statistics should be part of the regular training of engineers. All-day 
sessions of the American Chemical Society, and of the American In- 
stitute of Chemical Engineers, on statistical applications in their re- 
spective fields, are heavily attended. The chemical and engineering 
periodicals are publishing quite a number of papers showing examples 
of the uses of statistics. The bibliography of Hader and Youden on Ex- 
perimental Statistics [4], published in January, 1952, and containing 
some 150 references covering a three year period, can by now be con- 
siderably extended. At least twenty statistical texts published in the 
last few years clamor for the attention of the research worker in chem- 
istry. It may be concluded that chemists and engineers are becoming 
increasingly aware of the usefulness of statistical methods. 
Considering this situation, and in view of the competitive demands 
of other developing fields, it is natural to ask for a book that will sum- 
marize the statistical contribution in a short, easily understood, tech- 
nically correct, comprehensive manual, to be well-printed, well-bound, 
and inexpensively marketed. It seems entirely safe to state that such 
a book will not appear. If it is short, it will not be technically correct; 
if it is comprehensive, it will not be easily understood; and if it is well- 
made, it will not be inexpensive. Put in another way, a manual for the 
use of workers who need a convenient source of ready reference will 
be properly used only by those with considerable fluency in the field. 
A manual can hardly be an introduction. A serious introduction, in 
this field, can not possibly be short, if it is to carry the reader through 
from a weaning away from his prejudices about the fortuitous and the 
random, all the way to an understanding of modern experimental de- 
sign. 
But there is without doubt a need, and a growing one, for a manual 
of statistics, one that can be used by chemists who have had some time 
to think about statistical notions; one which gives standard practices 
and general rules, omitting derivations, but carefully including restric- 
tions and assumptions. Such a book may be called a cook-book but 
that need not be a term of opprobrium, provided we can find the sta- 
tistical counterpart of Escoffier. 

* A review article on W. L. Gore's Statistical Methods for Chemical Experimentation, 
Interscience Publishers, Inc., 1952, pp. vii, 210. 

Similar books, applying statistical methods in other research fields 
(agriculture, textiles, medicine, educational psychology) have been 
found to be very useful. In two important respects, chemical research 
is very like the substantive fields just mentioned. In the first place, 
many factors are thought to influence the outcome under study and 
the researcher wants to know about all of them. Secondly, it is quite 
common to find that duplicate runs (not duplicate readings, but seri- 
ous attempts to repeat results under similar conditions) do not check 
very well. 

It was indeed to these two aspects of agricultural and genetic re- 
search that R. A. Fisher addressed himself in developing his theory of 
the design of experiments. The extensions and simplifications of Fish- 
er’s ideas by J. Neyman and E. S. Pearson, make it entirely practicable 
to explain the major useful tools of modern statistics to chemists with 
little likelihood of misuse. Several important steps in this direction have 
already been taken. W. J. Youden [7] has written an introduction to 
statistics for chemists that is as beguiling as it is authoritative. K. A. 
Brownlee [2], and O. L. Davies [3], have carried us some distance fur- 
ther but there are by now major omissions in both books (e.g., operating 
characteristic curves, distribution-free statistics). 

With some experience of the interests of industrial chemists, and with 
some acquaintance with the current state of the art of statistics, it is 
not difficult to form a general picture of the contents of a useful man- 
ual. Such a manual would outline the meaning of statistical tests of 
significance, define and give operating characteristic curves for the 
usual tests, and clarify with examples drawn from chemistry, the sev- 
eral warnings that must always accompany the reporting of “statis- 
tically significant (or insignificant) differences” to non-statisticians. It 
would also give, in terms entirely intelligible to most chemists, the 
meaning and use of confidence intervals for a wide variety of cases 
that are immediately usable. 

Such a manual would then, presumably, proceed to the statistical 
part of planning experiments aimed at measuring the effects of several 
factors simultaneously. The algebra of the analysis-of-variance calcu- 
lations for quite general linear hypotheses is by no means too difficult 
for most research chemists and chemical engineers. By paying atten- 
tion to the assumptions (untested hypotheses) underlying each use 
of the analysis of variance, it would be quite practicable to give re- 
search workers considerable aid in the planning and analyzing of some 
multifactor experiments. Little space would need to be given to the 
theory of the fundamental distributions, even though the results of 
that theory are often necessary for judging the results of experiments. 

The chemist who is looking for a manual such as that described will 
be greatly disappointed in Gore's book. It is the reviewer's opinion, 
based on several careful readings, that none of the needs indicated 
above is satisfied by this work. The harm that such a poorly prepared 
(and poorly edited) book may do is not easily overestimated. Perhaps 
the major hurt will be to those few chemists who find themselves 
prejudiced against the general field by contact with the specific un- 
fortunate example. 


II. INTERSCIENCE MANUAL No. 1 


This book is the first in a new series of manuals which will provide, 
according to the publishers, “a straightforward description of labora- 
tory procedures and methods for the evaluation and recording of ex- 
perimental results.” The manual contains seven chapters, two appen- 
dixes, a glossary, a bibliography, and a subject index. 

The Introduction, Chapter I, starts with a discussion of the scope 
of statistical methods. On page 1 the author writes, “Thus the appli- 
cation of probability theory to define the nature of variability has led 
to techniques, called ‘Statistical Methods,’ whose useful function is to 
measure the uncertainty in inductive reasoning based on experimental 
data. This measure of uncertainty is a probability based on only the 
data at hand.” Since the reader is presumably a chemist in a hurry to 
get on to the usable results, perhaps these rather opaque sentences 
should not be criticized too adversely. On the other hand, it does seem 
too bad to promise something at the very beginning which cannot be 
delivered, and which has in fact long since been dropped as an objec- 
tive by all statisticians. 

The second section of the first chapter summarizes “An Experiment 
in Variation,” giving data from duplicate determinations of per cent 
moisture, by each of 6 analysts, on each of 5 samples from a nylon 
dryer, on each of two days. This experiment will be discussed later. 

Chapter II on Statistical Concepts starts with a section on Fre- 
quency Distributions. The first two sentences read, “A frequency dis- 
tribution is measurement data of more than one article, sample, time 
of measurement, or occurrence of similar classification. Frequency dis- 
tributions may be divided into two types—‘populations’ or ‘universes’ 
and samples taken from those ideally infinite universes.” The distinc- 
tion that is intended is surely important, but the chemist-reader who is 
going through his first book on statistics will hardly be helped by being 
told that a frequency distribution “is measurement data,” or that the 
second type of distribution is “samples taken from these ideally in- 
finite universes.” The contrast then made between population param- 
eters and sample statistics is important and is clearly drawn, but un- 
fortunately the distinction, and the relation, is not carried through in 
the remainder of the book. The same symbol, s, is used for the popula- 
tion standard deviation and for the sample standard deviation, even 
in the table of areas under the normal curve. The equation stated to 
be that of the normal distribution on page 16 would be more nearly 
intelligible if in place of f/N we had some symbol for the probability 
density, if in place of s we had σ, and if in place of x (the deviation of a 
single value from the sample mean) we had X − μ. 

Chapter III is on the Reliability of Estimates. Figure III-1, showing 
the distribution of single measurements, together with that of the av- 
erages of sets of four measurements, is misleading in that the area un- 
der the second curve is about one-tenth that under the former. The two 
areas should of course be the same. 

The definition of fiducial limits given on page 24 is: “The limits be- 
tween which one may have a given degree of confidence in what the 
true value (parameter) of a statistic will lie (sic) are called the fiducial 
limits.” There can hardly be many, chemists or others, who will find 
this language clear. 

The description of Student’s t-test is confused by the author’s pre- 
senting two formulas for estimating the variance of the estimated dif- 
ference between two means. The first adds the estimated variances of 
the two means, calculated separately; the second is the one given by 
Student and by Fisher. The author recommends the former, appar- 
ently on the grounds of ease of calculation. 

The example given on page 32, of a “Student's t-test” is a t-test only 
by the accident of equal sample sizes. The same formulas, used with 
unequal sample sizes, would not give a t-test. The statement, “Some 
question exists as to whether a t-test is valid if the standard errors of 
the two means are considerably different” is erroneous. The relative 
magnitudes of the standard errors of the two means have nothing to do 
with the case. Very likely the difference referred to is that between the 
two population variances. But in case these differ widely there is no 
question about whether the t-test is valid or not. It is invalid. When 
the two sample sizes are equal it would seem more sensible, if widely 
differing population variances are expected, or even possible, to esti- 
mate the population variance of differences. This would give a t of 3.08 
with 4 degrees of freedom, rather than the 3.67 with 8 degrees of free- 
dom reported by the author. 
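
The point about equal sample sizes can be checked numerically. The sketch below computes the standard error of the difference between two means in both ways, from the separately estimated variances of the two means and from the pooled variance of Student and Fisher, and also forms a t from the variance of differences with n - 1 degrees of freedom. The data are hypothetical (they are not Gore's), and the pairing of observations in the last function is arbitrary, as it would be for two independent samples.

```python
import math
import statistics


def se_separate(x, y):
    """Add the estimated variances of the two means, calculated separately."""
    return math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )


def se_pooled(x, y):
    """Student's form: pooled variance times (1/n1 + 1/n2)."""
    n1, n2 = len(x), len(y)
    pooled = (
        (n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y)
    ) / (n1 + n2 - 2)
    return math.sqrt(pooled * (1 / n1 + 1 / n2))


def t_from_differences(x, y):
    """t based on the variance of paired differences; returns (t, df)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n)), n - 1


# Hypothetical measurements with clearly unequal spread.
a = [10.1, 10.4, 9.8, 10.0, 10.2]
b = [11.0, 12.5, 10.1, 13.2, 11.7]

equal_n = (se_separate(a, b), se_pooled(a, b))
unequal_n = (se_separate(a, b[:3]), se_pooled(a, b[:3]))
t_value, df = t_from_differences(a, b)
print(equal_n, unequal_n, t_value, df)
```

With five observations in each sample the two standard errors coincide; with five and three they do not, which is why the first formula yields a t-test only by the accident of equal sample sizes.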

On the general question of unequal population variances it is not 
apparent that the author is following any particular published research, 
nor is there any reference to the literature of this problem. A. A. Aspin 
and B. L. Welch [1] recommend the same statistic that the present 
author does, but then they also supply a table of critical values of the 
statistic. H. Scheffé [6] shows how to calculate a quantity that has the 
t-distribution with the maximum possible number of degrees of free- 
dom, but it is by no means the statistic given by Gore. It may be 
doubted that the casual solution of the Fisher-Behrens problem of- 
fered in this book will satisfy many statisticians. It is to be feared that 
it will satisfy too many chemists, who, unaware of the confused state 
of the art, may assume that the equations given in the manual have 
the sanction of wide acceptance. 

A short section on the F-test for comparing two sample variances is 
followed by two pages on “propagation of error.” Gauss’ equations, 
which are approximate in general even when the population parameters 
are known, are intended, but sample statistics are used, even when 
only four degrees of freedom are available for each of two sample vari- 
ances, as in the numerical example presented. It would be more nearly 
correct to form five ratios at random from the two sets of five values 
given, to use these as five estimates of the population ratio, and then 
to use a t-value of 2.78 for four degrees of freedom, instead of the 2.0 
used in the text, which assumes an exact knowledge of the two vari- 
ances. This procedure would give a confidence interval of half-length 
0.018, roughly three times the width calculated by the author’s use of 
Gauss’ equations. The reader who wants to follow the calculation as 
printed may be slightly confused by two minor errors in arithmetic and 
two misplaced exponents in the formula. No indication is given of the 
approximations implied in the derivation of Gauss’ equations, nor of 
their use to derive confidence intervals for rational functions other than 
the sum and difference. 

Chapter IV is on the Analysis of Variance. The assumptions of sta- 
tistical independence and of a linear model are not stated. However the 
assumption of constant error variance is rightly emphasized. It is the 
author's judgment that this latter assumption “appears to be satisfied 
within practical approximations by most experimental data sets.” He 
proposes and gives Bartlett’s chi-square test for doubtful cases. This 
practice has the disadvantage, pointed out by G. P. E. Box, that a test 
sensitive to non-normality is being used to reject some applications of 
the F-test, the latter being in this respect a robust test, not sensitive to 
non-normality. 

An "analysis of variance" of the 120 observed moisture percentages 
in the experiment summarized above is then presented. The model be- 
ing used is not given, the expected values of the mean squares are not 
shown, nor is there any indication of how the “duplicate” measure- 
ments were made. If, as appears plausible from the context, the five 
samples are assumed to be a random sample from the dryer, then the 
so-called Mixed Model, first described, so far as the reviewer is aware, 
by Mood [5], is appropriate. 

The writing down of the expected values of the mean squares would 
have shown, before the experiment was carried out, that duplicates, 
even if properly taken, are of little value, since they can only be used 
to test the analyst-by-sample-by-time interaction. Put in the terms of 
the chemist: If there may be wide variation in the per cent moisture in 
various parts of the dryer, then this variability should be sampled, and 
not the variability of the moisture-determination; i.e. more samples 
should be taken from the dryer. In terms of the analyst of variance: 
Half the degrees of freedom, and so half the measurements made in this 
experiment, were used in judging the significance of a three-factor in- 
teraction. The three two-factor interactions, and the variability be- 
tween different parts of the dryer, could all have been better deter- 
mined, in one sense twice as well, if ten dryer-samples had been taken 
and no duplicates run. 
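
The degrees-of-freedom accounting behind this criticism can be laid out explicitly. The sketch below tabulates the degrees of freedom for the design as run (6 analysts, 5 samples, 2 days, duplicates) and for the alternative suggested (10 samples, no duplicates); the function name and labels are illustrative only.

```python
from itertools import combinations
from math import prod


def degrees_of_freedom(levels, replicates):
    """Degrees of freedom for a full factorial with equal replication."""
    names = list(levels)
    table = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            table[" x ".join(combo)] = prod(levels[f] - 1 for f in combo)
    cells = prod(levels.values())
    table["duplicates"] = cells * (replicates - 1)
    return table


as_run = degrees_of_freedom({"analyst": 6, "sample": 5, "day": 2}, replicates=2)
suggested = degrees_of_freedom({"analyst": 6, "sample": 10, "day": 2}, replicates=1)

for label in as_run:
    print(f"{label:24s} {as_run[label]:4d} {suggested[label]:4d}")
```

Both plans use 120 determinations and 119 degrees of freedom, but the plan as run spends 60 of them on duplicates, while the suggested plan gives 9 instead of 4 for samples and 45 instead of 20 for the analyst-by-sample interaction.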

The author fails to mention randomization in allocating treatments 
to experimental units. As a result, in five of the seven examples in which 
interactions are calculated (pages 58, 67, 78, 105, 107) the reported 
error mean squares are of an order of magnitude smaller than the inter- 
action mean squares. It seems likely then that these error mean squares 
were all calculated from “chemists’ duplicates,” that is, from measure- 
ments made on parallel samples, which only in rare cases can be ex- 
pected to give an unbiased estimate of the effects of all the chance fac- 
tors operating in an experiment. On page 59 we read “This example 
demonstrates that a very fallacious estimate of the reliability is some- 
times given by considering only duplicate checks.” Unfortunately there 
is no mention of the method, which like so much else we owe to R. A. 
Fisher, of getting an unbiased estimate of error. 

The fifth chapter is on the Design of Experiments. The first section 
attacks the dogma of the controlled, one-factor-at-a-time experiment. 
Measurements of a yield, y (actually the volume of a fixed weight of 
gas), were made in quintuplicate at each of three pressures, 400, 600, 
and 800 mm., the temperature being 25 degrees Centigrade. A plot of 
y versus the reciprocal of the pressure, 1/P, gives a nice straight line. 
Then two (quintuplicate) runs were made at 400 mm. pressure, but at 
temperatures 100 and 200 C. Again a three-point plot versus T gives 
a straight line. The author then combines the two equations for the 
two straight lines to get an equation of the form Y = a + b₁T + b₂/P, and 
shows that this equation predicts the yield at a pressure of 800 mm. 
and a temperature of 200 C very poorly indeed. But surely it would be 
an unusual chemist who would combine the two linear equations in 
this way, for this would imply that he had forgotten Boyle’s and 
Charles's laws. The force of the example, planned to show the weak- 
ness of the “classical” approach, is vitiated by the implausible equa- 
tion used. Using the form of equation which a chemist might well 
choose for the data at hand, viz. Y = k(T + a)/P, and evaluating a from 
one graph to be 277 and k from the other plot to be 41.38 this chemist, 
innocent of least-square methods, would find that the yield at 800 mm. 
and 200 C can be quite closely predicted: 24.67 calculated, 25.14 ob- 
served. Such a chemist might well conclude that nothing has been 
shown to be wrong with conventionally controlled experiments. 
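
The arithmetic of this prediction can be verified directly. The sketch assumes that the chemist's equation has the form Y = k(T + a)/P, with T in degrees Centigrade and P in mm.; this reading is an inference, adopted because it reproduces the calculated value of 24.67 quoted above.

```python
def gas_volume(temperature_c, pressure_mm, k=41.38, a=277.0):
    """Combined Boyle-Charles form Y = k (T + a) / P (assumed reading)."""
    return k * (temperature_c + a) / pressure_mm


predicted = gas_volume(temperature_c=200.0, pressure_mm=800.0)
observed = 25.14
relative_error = (observed - predicted) / observed

print(round(predicted, 2), round(100 * relative_error, 1))
```

The prediction differs from the observed yield by a little under two per cent, with a constant a close to the 273 that the gas laws would lead one to expect.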

The discussion of Factorial Design opens with a droll definition of 
orthogonality: “Such a design is said to be completely orthogonal be- 
cause the design can be symbolized by squares or rectangles.” The idea 
of additivity of effects is not mentioned; the treatment of the comple- 
mentary concept, interaction, is somewhat distressing, two of the three 
“types” given being in error. 

“Interaction in experiments in chemistry” Gore writes, “usually can 
be classified into one of three types of mathematical functions or into 
a combination of these: 

(1) Hyperbolic (cross product) relationship between variables (e.g. 
    Y=KPT or Y=KT/P). 
(2) Power function relationship (e.g., Y=K₁P²+K₂T²+ . . . etc.) 
(3) Logarithmic function relationship (e.g., Y=K₁ log P+K₂ log T 
    + . . . etc.)” 

The accompanying discussion does not clarify or correct the impres- 
sion given by these examples. There is no interaction between P and T 
in (2) or (3). 
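
That types (2) and (3) contain no interaction can be confirmed with the usual two-by-two interaction contrast, which vanishes whenever the effects of P and T are additive. In the sketch below the constants, the factor levels, and the exponent 2 in the power function are all arbitrary choices for illustration; the conclusion does not depend on them.

```python
import math


def interaction_contrast(f, p_low, p_high, t_low, t_high):
    """Difference between the effect of T at high P and at low P."""
    return (
        (f(p_high, t_high) - f(p_high, t_low))
        - (f(p_low, t_high) - f(p_low, t_low))
    )


def hyperbolic(p, t):
    return 3 * p * t  # Y = K P T


def power(p, t):
    return 2 * p**2 + 5 * t**2  # Y = K1 P^2 + K2 T^2


def logarithmic(p, t):
    return 2 * math.log(p) + 5 * math.log(t)  # Y = K1 log P + K2 log T


levels = dict(p_low=400, p_high=800, t_low=300, t_high=500)
contrasts = {
    "hyperbolic": interaction_contrast(hyperbolic, **levels),
    "power": interaction_contrast(power, **levels),
    "logarithmic": interaction_contrast(logarithmic, **levels),
}
print(contrasts)
```

Only the cross-product form gives a contrast different from zero; the other two are sums of a function of P alone and a function of T alone.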

The reason given for expecting large interactions in chemical experi- 


cial and biological sciences, The commonest reason for the existence of 


| 


STATISTICS IN CHEMICAL EXPERIMENTATION 483 


interactions is, of course, ignorance. When the experimenter does not 
know the form of the functional relation between his “factors” and his 
outcomes, or when the range of variation of some of the factors, and 
of the resultant outcomes, is so great that the equations he has used 
before break down, then he finds interactions. It appears likely that 
the author's belief in the frequent occurrence of interactions in chemical
experimentation is correlated with his failure to recommend randomization
and with his consequent underestimation of error variances.

The next section is on the Estimation of Experimental Error. An excellent
example is provided by a set of data taken to evaluate a method
of measuring the per cent weight-loss of vinyl polymers when heated
at 260 C. Two samples, one stabilized and one not, were measured by
two operators, after two times of heating, on two different days, in
duplicate. Unfortunately (perhaps deliberately, but neither reasons nor
consequences are given) the stabilized sample was examined on two
days and the unstabilized sample on two other days. If the 32 measurements
had been arranged in a fully balanced way, as a single replication
of a 4×2³, then some judgment could have been made on a matter
that must only be assumed from the data given, namely that the difference
between samples was the same on different days.

Noting that the two samples gave differing discrepancies between
duplicates, the author reports two error mean squares, but only one
sample × time interaction mean square (p. 71). He uses the larger error
mean square to test the interaction and judges it significant. He concludes
that “From these results it does not appear feasible to estimate
a reliability for the analytic method which will be independent of the
type of sample analysed.”

Inspection of the data makes it quite clear that the range of dupli- 
cate pairs for the unstabilized sample is about ten times that for the 
stabilized one. The ratio of the actual percentages of weight-loss is also 
about 10:1. It seems then that the coefficient of variation can be as- 
sumed constant. An analysis of variance of the logarithms of the values 
given permits eleven hypotheses to be tested (four on main effects, five 
on two-factor interactions, and two on three-factor interactions). All 
the mean squares for interactions are less than the error mean square, 
as are those for analysts and for days. Thus a set of simple conclusions, 
opposed to those drawn by the author, can be given. The precision of 
the method used, expressed as a coefficient of variation is 5%; this 
value holds for samples containing from 2 to 24 per cent volatile ma- 
terial. Secondly, the effect of increasing the time of heating from 30 
to 40 minutes is to increase the per cent weight-loss by a factor of about
1.22. (The true value of this factor lies with 95% certainty in the
range 1.17 to 1.27.) This increase holds for both sample types, for 
either operator, and for all days. Finally, the ratio of results by the 
two operators does not differ with statistical significance from 1.00, ly- 
ing with 95% confidence in the range 0.96 to 1.04. The latter con- 
clusion holds for both types of sample, for either time of heating, and 
for all days. 

The failure of the text to give a clear discussion of error and of ran- 
domization as a necessary part of the statistical design of an experi- 
ment continues to plague later sections. The discussion of interaction 
and error, of replication, and of confounding are greatly weakened by 
this obscurity. 

In the opinion of this reviewer, the lengthy discussion of the use of
2×2 Latin Squares is not of great value. In the only example given, in
which a half-replicate of a 2³ was run (three factors, each at two levels),
it was decided to carry through the other half-replicate anyway. Most
statisticians would have proposed the full 2³ in the first place, especially
in view of the author’s insistence on the wide prevalence of interactions
in chemical experimentation.

A fairly general treatment of linear hypotheses would have greatly
simplified the presentation of the Analysis of Variance and of the Design
of Experiments. The distinctions between the three broad classes
of linear models in the analysis of variance (I, II, and Mixed) are not
made, nor is a numerical example given of a components-of-variance
problem.

The chapter on correlation and regression follows the usual pattern,
dealing first with linear regression on one sure variable, then with
curvilinear regression, and finally with the multiple linear case. The
formula given for the estimated standard error of the intercept of a
straight line (8, page 131) is in error, being in fact the equation for the
standard error of the mean value of y. The y-variate is wrongly referred
to as the dependent variable, and the x-variate as the causal variable.
The assumptions underlying the derivations of the equations given are
not indicated, nor is there any advice on how to handle linear regression
when one or more of these assumptions are not satisfied. The standard
references on these problems are not given. The improvement in
fit due to shifting from measurements to their logarithms (p. 136) is
mistakenly judged by using Fisher’s z-transformation for comparing
two independent estimates of the same population correlation coefficient.
Finally, the “95% reliability limits for estimating Y” (the value
predicted from a multiple linear regression equation) are erroneously
given as a constant, not depending on the x-coordinate of the point
(p. 156).

The last chapter is on Attribute Statistics. 

An Appendix includes five tables—the “normal distribution area,” 
some percentage points of the cumulative distributions of t, chi-square, 
F, and of the sample correlation coefficient. No acknowledgment is 
made or source indicated. 

An annotated bibliography of 14 works is given but it will be clear 
from the comments above that this reviewer does not attach great 
weight to the opinions offered. 


III. SUMMARY AND CONCLUSION 


There is a real need for a manual to aid chemists in applying statistical
methods to experimentation. The book under review fails to satisfy
any part of that need. Some of its omissions are major ones; for example,
operating characteristic curves, randomization, and several of
the basic assumptions underlying analysis of variance and linear regression
are not mentioned. Its errors of presentation are equally serious.
For example, the notions of orthogonality and of interaction are
misdefined and misused; the distinction between parameters and statistics
is repeatedly confused.

The editors and publishers must share this adverse criticism with the 
author, since in many places the obscurity of the language used could 
easily have been rectified.


LIFE TESTING*


BENJAMIN EPSTEIN AND MILTON SOBEL
Wayne University 


I. INTRODUCTION 


IN THIS paper we discuss statistical problems which arise when the
observations become available in an ordered manner. Usually observations
made on a random variable do not become available in this
way. If n items are taken from a machine and measured for some characteristic
such as diameter, it would be quite an anomaly and indeed a
cause for concern if the first item taken from the machine had the smallest
diameter; the second item, the second smallest diameter, etc. However,
there do exist numerous practical situations, for example, life
testing, fatigue testing, and other kinds of destructive test situations,
where the data do become available in this way. If n radio tubes are
put through a life test, for example, then the weakest one fails first in
time, the second weakest one fails next, etc. Indeed it seems fairly clear
that observations will naturally occur in an ordered manner in life test
situations whether we talk about the life of electric bulbs, life of radio
tubes, life of ball bearings, life of various kinds of physical equipment,
or length of life after some treatment performed on animals or human
beings. There are still other situations, for example, testing the current
needed to blow out a fuse, the voltage needed to break down a condenser,
the force needed to rupture some physical material, etc., where
observations become available in order if one arranges the test in such
a way that every item in the sample is subjected to the same stimulus
(current, voltage, stress, dosage, etc.) so that the weakest item
fails first, then the second weakest item fails, etc.

Put in general terms, we test n items drawn at random from some
population and the data become available in such a way that the smallest
observation comes first, the second smallest second, ... , and finally
the largest observation last. Clearly we can, if we choose, discontinue
experimentation after we have observed the first r failures in a life test.
What are the advantages associated with the possibility of stopping
before all n observations are made? It seems that two principal advantages
stem from the fact that the observations occur in an ordered


* The work described here has been carried out under an Office of Naval Research Contract. Some
of the results were obtained at Stanford University in the summer of 1951. This paper is essentially



manner. These are that we may be able to reach a decision in a shorter
time or with fewer observations than if we were to utilize a procedure
which involves observing what happens to all the items under test (and
thus in effect disregards the basic fact that information is being fed to
us in an ordered manner).


II. PREVIOUS LITERATURE 


The published literature dealing with the possibility of making use
of order to reduce the time of experimentation or the number of observations
or both is, as far as we know, limited to three papers. In the
first two of these papers the underlying distribution is assumed to be
normal. The first paper was by Jacobson [7], who compares the operating
characteristic curves (for testing the mean of a normal distribution)
of a test procedure based on the lowest 3 out of 5 observations with that
based on the average of 5 out of 5, and 4 out of 4. He shows that the
operating characteristic curve based on the average of the lowest 3 out
of 5 observations is almost the same as one based on the average of 4
out of 4 observations. He points out that in many cases the 2 out of 5
items which have not been tested (since one stops the test after having
observed the three smallest values) are for all practical purposes as
good as new and consequently one has gained as much from using up 3
items as one could have by using 4. In the case of certain electrical
tests, ordered observations such as Jacobson considers can be obtained
by placing the items tested in a test panel so that they are all subjected
to the same current or voltage, but are not destroyed simultaneously.
Hence one can get additional information by simply placing new fuses
or tubes in the panel as the old ones fail. Jacobson considers only test
panels of five sockets. The need for generalization to n sockets is clear.

The second published paper is by Walsh [9]. The underlying assumption
is in the main still the one involving normality and the values of
r and n considered are very large (asymptotic theory). While Jacobson’s
main emphasis was on using order to cut down on the number of
observations (presumably because each item destroyed is expensive),
Walsh’s emphasis is both on the time-saving and observation-saving
possibilities associated with using order.

The third paper is by Halperin [6], who assumes only that the underlying
probability density function, $f(x; \theta)$, is subject to certain mild
regularity conditions. Again one is dealing with the asymptotic situation
where r and n are large. The principal result is that $\hat{\theta}$, the maximum
likelihood estimate of $\theta$, is consistent, asymptotically normally
distributed, and of minimum variance for large samples. A general
expression is given for the variance of the asymptotic distribution of $\hat{\theta}$.
Small sample estimation of $\theta$ is considered for two special cases, one of
these being the exponential density. As we shall see, this bears a close
relationship to the present paper.


III. SOME RESULTS IN THE EXPONENTIAL CASE


When we started our work on this general problem one possible first
problem was to generalize Jacobson’s results in the normal case for
various combinations of r and n. After some consideration and after
discussion with electronics experts, however, we decided to turn our
initial efforts to ordered observations drawn from non-normal distributions.
Specifically, we decided to study the case where the characteristic
X being investigated has an exponential distribution with a density
$f(x; \theta)$ of the form

(1)    $f(x; \theta) = \dfrac{1}{\theta}\, e^{-x/\theta}, \qquad \theta > 0,\ x > 0.$

If x is considered as life in hours, it appears that by choosing $\theta$ suitably,
one can fit reasonably well the distribution of life for many types of
electronic tubes. While this assumption will ultimately need investigation
(since perhaps the distributions are more nearly type III or some
other skewed form) let us see what can be said if the density $f(x; \theta)$ is
really of this exponential form. Speaking in physical terms, we may say
that $\theta$ is just the average life in hours since

(2)    $E(X) = \int_0^\infty x f(x; \theta)\,dx = \theta.$


The first questions asked were: Suppose n items are drawn from a
distribution with a density of the form $f(x; \theta) = (1/\theta)e^{-x/\theta}$. Suppose the
observations¹ become available in order, i.e.,
$x_{1,n} \le x_{2,n} \le \cdots \le x_{r,n} \le \cdots \le x_{n,n}$.
Suppose experimentation is discontinued after the first r
observations are made. Then

(a) What is $\hat{\theta}_{r,n}$, the maximum likelihood estimate of $\theta$?
(b) What is the distribution of $\hat{\theta}_{r,n}$?

Without going into detail at this point, we assert that

(3)    $\hat{\theta}_{r,n} = \dfrac{x_{1,n} + x_{2,n} + \cdots + x_{r,n} + (n-r)\,x_{r,n}}{r}.$

Further, $\hat{\theta}_{r,n}$ is unbiased and has a chi-square distribution² which
depends only on r and not on n and is, in fact, identical with the distribution
of $\hat{\theta}_{r,r}$. From the point of view of estimation, this means that the
estimate (3) has exactly the same precision as does

       $\hat{\theta}_{r,r} = \dfrac{x_{1,r} + x_{2,r} + \cdots + x_{r,r}}{r},$

the average of r out of r observations. From the point of view of acceptance
testing, the operating characteristic curve based on the lowest r
out of n ordered observations (acceptance region of the form $\hat{\theta}_{r,n} > C$
or $\hat{\theta}_{r,n} < C$)³ is identical with that based on all r out of r observations.
Detailed proofs of the statements just made are given in Section 1 of
the Appendix.
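
As a small illustration of estimate (3), the sketch below computes it from the first r failure times of n items on test: the sum of the r observed lives plus (n - r) times the r-th failure time, divided by r. The function name and the sample figures are hypothetical.

```python
from typing import Sequence

def theta_hat(first_failures: Sequence[float], n: int) -> float:
    """Estimate (3) from the first r ordered failure times out of n items."""
    r = len(first_failures)
    if r == 0 or r > n:
        raise ValueError("need 1 <= r <= n observed failure times")
    ordered = sorted(first_failures)
    # Observed lives plus (n - r) unfailed items, each counted at x_{r,n}.
    weighted_sum = sum(ordered) + (n - r) * ordered[-1]
    return weighted_sum / r

# Hypothetical data: 5 items on test, stopped after the third failure.
example = theta_hat([100.0, 200.0, 300.0], n=5)
print(example)
```

With r = n the expression reduces to the ordinary average of the r observations.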


IV. SOME REMARKS ON THE TIME SAVING FEATURE OF THE TEST 


If we assume, as we may in some cases, that the (n−r) untested items
are essentially as good as new, then we are clearly in a situation where
taking the lowest r out of n observations uses up the same number of
items as taking all r out of r. What, then, is the justification for using
the first procedure as against the second procedure? The answer is that
the only justification is to save time. For instance, a test procedure
which involves taking the smaller of two random observations will lead
to a test whose operating characteristic curve is the same as that found
in observing 1 out of 1. However, the expected length of time for the
first procedure is only one-half that for the second procedure. Consequently,
the procedure involving the use of the smaller of two observations
and stopping there is to be preferred to the one involving taking
just one observation at random, if the saving in time outweighs the
loss due to testing two items rather than one. Even if the (n−r) items
are not as good as new, it might be of critical importance to be able to
come to a decision quickly. The decision might involve the disposition
of thousands of items and the possibility of coming to a decision quickly
without increasing the risks of making a wrong decision might well be
worth the cost of (n−r) additional items.

Let $E(X_{r,n})$ be the expected length of time needed to observe the first
r out of n ordered observations, and let $E(X_{r,r})$ be the expected length
³ ... quantity which can be computed in advance, it seems intuitively reasonable, and is indeed theoretically
sound, to accept the hypothesis that the true mean life is some desired high value $\theta_1$ ... 
hand, $\hat{\theta}_{r,n} < C$, we accept the alternative hypothesis that the true mean life is some low (undesired)
value $\theta_2$. This question is discussed in some detail in Section 3 of the Appendix.

of time needed to observe all r out of r. Then the ratio $E(X_{r,n})/E(X_{r,r})$
is a measure of the expected saving in time due to using the first procedure
as compared with the second procedure.⁴ In Table 1 we give the
values of this ratio for selected small values of r and n. This table shows
that if “time is money,” procedures which use ordered observations may
be very advantageous.

TABLE 1


RATIO OF THE EXPECTED WAITING TIME TO OBSERVE
THE r'TH FAILURE IN SAMPLES OF SIZE
n AND r RESPECTIVELY
$E(X_{r,n})/E(X_{r,r}) = a_{r,n}$


  r \ n     1     2     3     4     5    10    15    20
    1       1   .50   .33   .25   .20   .10  .067  .050
    2       -     1   .56   .39   .30   .14  .092  .068
    3       -     -     1   .59   .42   .18   .12  .087
    4       -     -     -     1   .62   .23   .14  .104
    5       -     -     -     -     1   .28   .18  .125
   10       -     -     -     -     -     1   .35   .23

V. A TEST PROCEDURE


The next question we ask is how to find a test procedure which
will approximate a prescribed operating characteristic curve. Put in
statistical terms, we want to test the hypothesis $H_0: \theta = \theta_1$ against the
alternative $H_1: \theta = \theta_2 < \theta_1$. It turns out that our rule of action should be:
accept $H_0$ if $\hat{\theta}_{r,n} > C$ and reject $H_0$ if $\hat{\theta}_{r,n} < C$. In particular, how do we
find r and C if we require that the operating characteristic curve shall
be such that for $\theta = \theta_1$, $L(\theta_1)$ = Pr (accept $\theta = \theta_1$ given that $\theta_1$ is true)
$= 1 - \alpha$ and for $\theta = \theta_2$, $L(\theta_2)$ = Pr (accept $\theta = \theta_1$ given that $\theta_2$ is true) $\le \beta$?
$\alpha$ and $\beta$ may be thought of as errors of the first and second kind or as
producer’s and consumer’s risks, respectively. It turns out to be pos-

⁴ For example, suppose that it takes on the average $T_r$ hours until the testing of all r out of r items
is completed. Suppose that $E(X_{r,n})/E(X_{r,r}) = a_{r,n}$. Then the expected length of time required for the
failure of the first r out of n is given by $a_{r,n} T_r$. It can be shown that

       $E(X_{r,n}) = \theta \sum_{j=1}^{r} \dfrac{1}{n-j+1}.$

The formula for $E(X_{r,n})$ and other pertinent results are derived in Section 2 of the Appendix.

sible to find r and C, given $\theta_1$, $\theta_2$ (really the ratio $\theta_1/\theta_2$ is all that is
needed) and $\alpha$, $\beta$. The computation can be greatly simplified for selected
values of $\alpha$ and $\beta$, by using the results and tables contained in a
paper by Eisenhart [3]. Also of use in this connection are the results
and operating characteristic curves given in a paper by Ferris, Grubbs,
and Weaver [4]. A detailed treatment of how to find a “best” test and
compute its operating characteristic curve is given in Section 3 of the
Appendix.


VI. AN EXAMPLE


For example if $\theta_1/\theta_2 = 3$ and $\alpha = \beta = .05$ it is easy to show that a suitable
r to use is 10. If $\theta_1/\theta_2 = 3$ and $\alpha = .10$ and $\beta = .05$, then the proper r
is 8. Similarly if $\theta_1/\theta_2 = 3$ and $\alpha = .05$ and $\beta = .10$, then the proper r is
8. This means, for instance, that if we want the test procedure to accept
a lot whose average life is $\theta_1 = 1500$ hours 95 per cent of the time,
and to accept a lot whose average life is $\theta_2 = 500$ hours only 5 per cent
of the time, then a possible procedure is to observe $x_{1,n}, x_{2,n}, \cdots, x_{10,n}$,
the first 10 among n items (n > 10), and if $\hat{\theta}_{10,n} > 814$ hours accept $\theta = \theta_1$;
if $\hat{\theta}_{10,n} < 814$ hours accept $\theta = \theta_2$. Such a procedure will have an operating
characteristic curve for which $L(\theta_1) = .95$ and $L(\theta_2) \le .05$.⁵ It should
be noted that n, the number of items tested, is left arbitrary. If one’s
object is to reduce testing time, then it is clearly advisable from Table 1
to make n more than 10.
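
The choice of r and C in this example can be reproduced with the result of Section 3 of the Appendix, namely that the acceptance region is $\hat{\theta}_{r,n} > \theta_1 \chi^2_{1-\alpha}(2r)/2r$ and that $2r\hat{\theta}_{r,n}/\theta$ is chi-square with 2r degrees of freedom. The sketch below is one way to do the computation, not the authors' procedure (they refer to published tables); it uses the closed-form tail of a chi-square with even degrees of freedom, bisection for the percentage point, and illustrative function names.

```python
import math

def chi2_sf_even(x: float, r: int) -> float:
    """Pr(chi-square with 2r d.f. > x); exact finite sum for even d.f."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, r):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total

def chi2_upper_point(gamma: float, r: int) -> float:
    """Value c with Pr(chi-square(2r) > c) = gamma, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, r) > gamma:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, r) > gamma:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def cutoff(theta1: float, r: int, alpha: float) -> float:
    """C = theta1 * chi2_{1-alpha}(2r) / (2r)."""
    return theta1 * chi2_upper_point(1.0 - alpha, r) / (2 * r)

def oc(theta: float, theta1: float, r: int, alpha: float) -> float:
    """L(theta): probability of accepting theta1 when theta is true."""
    return chi2_sf_even(2 * r * cutoff(theta1, r, alpha) / theta, r)

def smallest_r(theta1: float, theta2: float, alpha: float, beta: float,
               r_max: int = 500) -> int:
    """Smallest r whose test of size alpha has L(theta2) <= beta."""
    for r in range(1, r_max + 1):
        if oc(theta2, theta1, r, alpha) <= beta:
            return r
    raise ValueError("no r <= r_max meets the requirement")

r = smallest_r(1500.0, 500.0, 0.05, 0.05)
print(r, round(cutoff(1500.0, r, 0.05)), oc(500.0, 1500.0, r, 0.05))
```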


VII. A TEST PROCEDURE BASED ONLY ON $x_{r,n}$


We should like to raise the possibility of another kind of decision
rule which is simpler to state and to apply, and which has a power curve
which coincides for all practical purposes with one based on $\hat{\theta}_{r,n}$. To see
the motivation behind this procedure, we examine equation (3). It will
be noted that in (3) $x_{r,n}$ is weighted more heavily (if r<n) than are the
earlier observations $x_{1,n}, x_{2,n}, \cdots, x_{r-1,n}$. This naturally raises the
question of how much one loses (for example, choosing between $\theta_1$ and
$\theta_2$ ($\theta_2 < \theta_1$)) if only $x_{r,n}$ is used in making a decision. We know that for
given $\theta_1/\theta_2$, $\alpha$, $\beta$ one can find an acceptance region for $\theta_1$ of the form:
Accept $\theta_1$ if $\hat{\theta}_{r,n} > C_1$ (reject otherwise). The question is whether for the
same r one can choose n sufficiently large and a suitable $C_2$ such that
the rule: Accept $\theta_1$ if $x_{r,n} > C_2$, reject otherwise, has an operating characteristic
curve which is for all practical purposes coincident with the
one based on the rule $\hat{\theta}_{r,n} > C_1$? This is possible and n need not be much


⁵ By actual computation $L(\theta_2)$ is in this case equal to .048.

needed to take the first 10 out of n>10 observations if we were to base
our decision on the value of $\hat{\theta}_{10,n}$ and get the required operating characteristic
curve. It turns out that if n > 14, just as good a decision rule
(in the sense of giving the same operating characteristic curve as one
based on $\hat{\theta}_{10,n}$) is to use just $x_{10,n}$, the tenth value in a sample of size n.
If n > 14 and we are committed to taking the first 10 observations and
then stopping and making a decision, it appears that little information
is lost by forgetting about the nine earlier observations $x_{1,n}, x_{2,n}, \cdots,
x_{9,n}$.

As a specific example, we go back to testing $\theta_1 = 1500$ against $\theta_2 = 500$
with $L(\theta_1) = 1 - \alpha = .95$ and $L(\theta_2) \le \beta = .05$. Let our test now be based
just on $x_{10,20}$, the 10th smallest observation in a sample of size 20. Then
it can be shown that the decision rule: Accept $\theta = \theta_1$ if $x_{10,20} > 540$ hours,
reject otherwise (i.e., accept $\theta_2$) yields an operating characteristic curve
with the prescribed $\alpha$, $\beta$. This can be put another way: If the 10th observation
among 20 ordered observations occurs before 540 hours, accept
$\theta_2 = 500$ hours (i.e., reject $\theta_1 = 1500$ hours); if the 10th observation
appears after 540 hours accept $\theta_1 = 1500$ hours. Clearly, in the latter
event, we would stop at 540 hours and not go on since it is a fortiori
true that $x_{10,20} > 540$ if the 10th observation has not occurred by 540
hours. Thus, the idea of using just the tenth observation leads in a most
natural way to a consideration of truncated procedures.
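
The operating characteristic of the rule based on $x_{10,20}$ alone can be checked directly, since $x_{r,n} > C_2$ exactly when fewer than r of the n items have failed by time $C_2$, and under the exponential density each item fails by then with probability $1 - e^{-C_2/\theta}$. The sketch below uses this binomial argument; it is an illustration with a hypothetical function name, not the computation the authors report.

```python
import math

def oc_single_order_stat(theta: float, r: int, n: int, c2: float) -> float:
    """Pr(x_{r,n} > c2) when lives are exponential with mean theta."""
    p_fail = 1.0 - math.exp(-c2 / theta)   # chance one item fails by c2
    # Accept when at most r - 1 of the n items have failed by time c2.
    return sum(math.comb(n, k) * p_fail**k * (1.0 - p_fail)**(n - k)
               for k in range(r))

L1 = oc_single_order_stat(1500.0, r=10, n=20, c2=540.0)
L2 = oc_single_order_stat(500.0, r=10, n=20, c2=540.0)
print(f"L(1500) = {L1:.3f}, L(500) = {L2:.3f}")
```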

The possibility of truncation arises from the fact that information
becomes available in an ordered manner. The possibility of truncating
even before reaching the rth observation out of n (for example, the possibility
of developing sequential procedures in problems where the data
arise in an ordered manner) now faces us squarely. It seems fairly evident
that we need to examine the gains (either in average time necessary
to make a decision or in the average number of items destroyed)
attainable by truncated and sequential procedures. It also seems clear
that as the theory develops, the whole problem should be considered in
the light of decision theory.

It appears on the surface that our results are rather specialized since
they were obtained for densities of the form $f(x; \theta) = (1/\theta)e^{-x/\theta}$, $x > 0$,
$\theta > 0$. However, similar results are valid in a wider class of situations.⁶


⁶ Some of the results achieved for the density function $(1/\theta)e^{-x/\theta}$ also hold for the cumulative
distribution

       $F(x) = 1 - e^{-g(x)/\theta}, \quad x \ge 0 \ (\theta > 0)$
       $\phantom{F(x)} = 0, \quad x < 0,$

where g(x) is a strictly increasing function of x for $x \ge 0$ with $g(0) = 0$ and $g(\infty) = \infty$. For example the
maximum likelihood estimate of $\theta$ is then given by

       $\hat{\theta}_{r,n} = \dfrac{g(x_{1,n}) + \cdots + g(x_{r,n}) + (n-r)\,g(x_{r,n})}{r}.$

As a case in point, we may mention the Weibull distribution where $g(x) = x^b$ (b a known constant).

It is known, for example, that in life and fatigue testing, skewed distributions
such as the type III, logarithmico-normal, and the Weibull distribution
(named after the Swedish physicist W. Weibull) are useful in
fitting data. While we do not, at this stage of our research, have proofs,
we think that we are justified in stating that for the skewed distributions
enumerated above, the utilization of the first r out of n ordered observations
and in fact just the rth out of n ordered observations will
give decision rules which will have at least as good power as those in
current use and will save either time or items destroyed or both.

In this portion of the paper we have tried to stress some of the under- 
lying ideas and the potential value of the proposed methods without 
giving any mathematical details. Some details are given in the Ap- 
pendix. 


APPENDIX 


1. Derivation of a “Best” Estimate Based on the First r out of n Ordered
Observations Drawn from an Exponential Distribution.


Let the following assumptions be made:

(i) n items are drawn at random from a density of the form
$f(x; \theta) = (1/\theta)e^{-x/\theta}$, $x > 0$, $\theta > 0$;

(ii) the observations become available in order so that
$x_{1,n} \le x_{2,n} \le \cdots \le x_{r,n} \le \cdots \le x_{n,n}$, where by $x_{i,n}$ ($1 \le i \le n$) is meant the ith
smallest observation in a sample of ordered observations;

(iii) the experiment is discontinued as soon as $x_{r,n}$ has become available
(i.e., after the first r observations are made).

We wish under (i), (ii), and (iii) to find a “good” estimate of $\theta$ and
to give the distribution of this estimate. This objective is attained in
the following theorem.

Theorem 1: Under (i), (ii), and (iii) an estimate based on the first r
out of n ordered observations which is “best” in the sense that it is
maximum likelihood, unbiased, minimum variance, efficient, and sufficient
is given by

(1)    $\hat{\theta}_{r,n} = \dfrac{x_{1,n} + x_{2,n} + \cdots + x_{r,n} + (n-r)\,x_{r,n}}{r}.$

The probability density function of $\hat{\theta}_{r,n}$ is given by

(2)    $f_r(y) = \dfrac{1}{(r-1)!}\,(r/\theta)^r\, y^{r-1} e^{-ry/\theta}, \quad y > 0$
       $\phantom{f_r(y)} = 0, \quad \text{elsewhere}.$

In order to show that $\hat{\theta}_{r,n}$ as given in (1) is the maximum likelihood
estimate we write down the joint probability density function of the
first r out of n ordered observations $X_{1,n}, X_{2,n}, \cdots, X_{r,n}$. This is given
by

(3)    $f(x_{1,n}, x_{2,n}, \cdots, x_{r,n}; \theta) = \dfrac{n!}{(n-r)!}\,\dfrac{1}{\theta^r}\,
        \exp\Bigl\{-\Bigl[\sum_{i=1}^{r} x_{i,n} + (n-r)\,x_{r,n}\Bigr]\Big/\theta\Bigr\},$

       $0 \le x_{1,n} \le x_{2,n} \le \cdots \le x_{r,n} < \infty.$

It can be shown in the usual way that $\hat{\theta}_{r,n}$ as given in (1) maximizes f
and is thus the maximum likelihood estimate.

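The sufficiency of the estimate can be verified at once by using a result
in Cramér [1, p. 488], since the density (3) can be written in the
form

       $f(x_{1,n}, x_{2,n}, \cdots, x_{r,n}; \theta) = g(\hat{\theta}_{r,n}, \theta)\, h(x_{1,n}, x_{2,n}, \cdots, x_{r,n})$

where

       $h(x_{1,n}, x_{2,n}, \cdots, x_{r,n}) = 1, \quad \text{if } 0 \le x_{1,n} \le x_{2,n} \le \cdots \le x_{r,n} < \infty$
       $\phantom{h(x_{1,n}, x_{2,n}, \cdots, x_{r,n})} = 0, \quad \text{otherwise}.$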

We defer the proof that $\hat{\theta}_{r,n}$ is efficient, unbiased, and minimum variance
until we show that the probability density function is given by
(2). This we now do.

Instead of the random variables $X_{i,n}$, let us introduce r new random
variables $Y_{i,n}$, $1 \le i \le r$, where

(4)    $Y_{1,n} = X_{1,n}$ and $Y_{i,n} = X_{i,n} - X_{i-1,n}, \quad 2 \le i \le r.$

We shall now prove Lemma 1.

Lemma 1: The random variables $Y_{i,n}$ defined by (4) are mutually independent.
Further, for each i, $(n-i+1)Y_{i,n}$ is distributed with common
density $(1/\theta)e^{-y/\theta}$, $y > 0$, and each $Y_{i,n}$ can be considered as a random
variable which is the smallest value in a random sample of size
(n−i+1) drawn from the parent density function.

Proof: The joint probability density function of the $Y_{i,n}$ [obtained
from (3)] is

(5)    $g(y_{1,n}, y_{2,n}, \cdots, y_{r,n}) = \dfrac{n!}{(n-r)!}\,\dfrac{1}{\theta^r}\,
        \exp\Bigl\{-\sum_{i=1}^{r} (n-i+1)\,y_{i,n}/\theta\Bigr\}$

where $0 \le y_{i,n} < \infty$, $i = 1, 2, \cdots, r$.

Rewriting (5) as

(6)    $g(y_{1,n}, y_{2,n}, \cdots, y_{r,n}) = \prod_{i=1}^{r} \dfrac{n-i+1}{\theta}\, e^{-(n-i+1)\,y_{i,n}/\theta}$

clearly establishes Lemma 1.
Rewriting $\hat{\theta}_{r,n}$ in terms of the $y_{i,n}$, (1) becomes

(7)    $\hat{\theta}_{r,n} = \sum_{i=1}^{r} (n-i+1)\,y_{i,n}/r.$

Since the characteristic function of the density $(1/\theta)e^{-x/\theta}$ is given by
$\varphi_x(t) = (1 - i\theta t)^{-1}$, it follows at once from Lemma 1 that

(8)    $\varphi_{\hat{\theta}_{r,n}}(t) = \Bigl(1 - \dfrac{i\theta t}{r}\Bigr)^{-r}.$

From the uniqueness theorem for characteristic functions, it follows on
inversion that the probability density function of $\hat{\theta}_{r,n}$ is given by

(9)    $f_r(y) = \dfrac{1}{(r-1)!}\,(r/\theta)^r\, y^{r-1} e^{-ry/\theta}, \quad y > 0$
       $\phantom{f_r(y)} = 0, \quad \text{elsewhere}.$

This establishes (2) of Theorem 1.
To complete the proof of Theorem 1, we show that $\hat{\theta}_{r,n}$ is unbiased,
efficient,⁷ and has minimum variance. The unbiasedness of $\hat{\theta}_{r,n}$ is a consequence
of the fact that $E(\hat{\theta}_{r,n}) = \int_0^\infty y f_r(y)\,dy = \theta$.
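For efficiency and minimum variance let us compute the Cramér-Rao
lower bound $1/E(\partial \log f/\partial\theta)^2$, where f is given by (3). But (3) can be
rewritten as

(10)   $f = C\,\theta^{-r} e^{-r\hat{\theta}_{r,n}/\theta}$, where $C = n!/(n-r)!$

Thus

(11)   $\log f = \log C - r \log \theta - \dfrac{r\,\hat{\theta}_{r,n}}{\theta}$

and

(12)   $\dfrac{\partial \log f}{\partial \theta} = -\dfrac{r}{\theta} + \dfrac{r\,\hat{\theta}_{r,n}}{\theta^2}.$

Thus

(13)   $\begin{aligned}
        E\Bigl(\dfrac{\partial \log f}{\partial \theta}\Bigr)^2
        &= E\Bigl(-\dfrac{r}{\theta} + \dfrac{r\,\hat{\theta}_{r,n}}{\theta^2}\Bigr)^2 \\
        &= r^2 E\Bigl[\dfrac{1}{\theta^4}\hat{\theta}_{r,n}^2 - \dfrac{2}{\theta^3}\hat{\theta}_{r,n} + \dfrac{1}{\theta^2}\Bigr] \\
        &= r^2\Bigl[\dfrac{1}{\theta^4}\Bigl(\theta^2 + \dfrac{\theta^2}{r}\Bigr) - \dfrac{2}{\theta^2} + \dfrac{1}{\theta^2}\Bigr] \\
        &= r/\theta^2.
        \end{aligned}$

Hence the Cramér-Rao lower bound is $\theta^2/r$.

But Var $(\hat{\theta}_{r,n}) = \int_0^\infty y^2 f_r(y)\,dy - \theta^2 = \theta^2/r$ and since the assumptions
needed for the derivation of the Cramér-Rao lower bound are clearly
met in our problem, $\hat{\theta}_{r,n}$ is minimum variance and efficient since any
other estimate must have variance at least equal to $\theta^2/r$. Thus Theorem
1 is completely established.

⁷ For the definition of efficiency see [1], p. 481.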
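
Theorem 1 can also be checked by simulation. The sketch below is an illustration only, not part of the proof: it draws repeated samples of n exponential lives, keeps the first r ordered observations, forms estimate (1), and compares the empirical mean and variance with $\theta$ and $\theta^2/r$. The parameter values, seed, and function name are arbitrary choices.

```python
import random
import statistics

def simulate_theta_hat(theta: float, r: int, n: int, trials: int,
                       seed: int = 0) -> tuple[float, float]:
    """Empirical mean and variance of estimate (1) over repeated life tests."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        # expovariate takes the rate 1/theta, so lives have mean theta.
        lives = sorted(rng.expovariate(1.0 / theta) for _ in range(n))
        first_r = lives[:r]
        estimates.append((sum(first_r) + (n - r) * first_r[-1]) / r)
    return statistics.fmean(estimates), statistics.pvariance(estimates)

# Hypothetical settings: theta = 100 hours, r = 5, two different n.
for n in (10, 20):
    mean, var = simulate_theta_hat(100.0, r=5, n=n, trials=20000)
    print(f"n={n}: mean {mean:.1f} (theory 100), variance {var:.0f} (theory 2000)")
```

Up to sampling error, the results do not depend on n, in line with the statement that the distribution of the estimate depends only on r.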


2. Distribution, Expectation and Variance of $X_{r,n}$


The random variable $X_{r,n}$ can be interpreted as the waiting time to
get the rth failure in a sample of size n. The probability density function
of $X_{r,n}$ is given by

(14)   $g_r(x) = \dfrac{n!}{(r-1)!\,(n-r)!}\,\dfrac{1}{\theta}\,\bigl[1 - e^{-x/\theta}\bigr]^{r-1} e^{-(n-r+1)x/\theta}, \quad x > 0.$

One can easily find $E(X_{r,n})$, the expected waiting time, directly from
$\int_0^\infty x\,g_r(x)\,dx$. This gives the result

(15)   $E(X_{r,n}) = r\dbinom{n}{r}\,\theta \sum_{k=0}^{r-1} (-1)^k \dbinom{r-1}{k} \Big/ (n-r+k+1)^2.$

A much simpler formula for $E(X_{r,n})$ will now be established. To do so
we write $X_{r,n}$ as

(16)   $X_{r,n} = X_{1,n} + (X_{2,n} - X_{1,n}) + \cdots + (X_{r,n} - X_{r-1,n}).$

Hence from (16) and (4) it follows that

(17)   $X_{r,n} = \sum_{i=1}^{r} Y_{i,n}.$

Thus by Lemma 1

(18)   $E(X_{r,n}) = \sum_{i=1}^{r} E(Y_{i,n}) = \theta \sum_{j=1}^{r} 1/(n-j+1).$

Also from Lemma 1 it follows that

(19)   Var $(X_{r,n}) = \theta^2 \sum_{j=1}^{r} 1/(n-j+1)^2.$


It is left to the reader to verify the identities obtained by equating
the right hand sides of (15) and (18). It should also be pointed out that
in [5], p. 324, Gumbel found formulas (18) and (19) by using the moment
generating function of $X_{r,n}$.
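
The identity between the two expressions for the expected waiting time can be checked exactly for small r and n. The sketch below assumes the direct formula takes the alternating-sum form obtained by expanding $[1 - e^{-x/\theta}]^{r-1}$ in the density of $X_{r,n}$ and integrating term by term, namely $r\binom{n}{r}\theta\sum_{k=0}^{r-1}(-1)^k\binom{r-1}{k}/(n-r+k+1)^2$; both functions return $E(X_{r,n})/\theta$, and their names are illustrative.

```python
from fractions import Fraction
from math import comb

def mean_wait_direct(r: int, n: int) -> Fraction:
    """E(X_{r,n})/theta from term-by-term integration of x * g_r(x)."""
    total = sum(Fraction((-1) ** k * comb(r - 1, k), (n - r + k + 1) ** 2)
                for k in range(r))
    return r * comb(n, r) * total

def mean_wait_spacings(r: int, n: int) -> Fraction:
    """E(X_{r,n})/theta as the sum of the expected spacings."""
    return sum(Fraction(1, n - j + 1) for j in range(1, r + 1))

# Exact rational arithmetic, so equality is tested without rounding error.
agree = all(mean_wait_direct(r, n) == mean_wait_spacings(r, n)
            for n in range(1, 13) for r in range(1, n + 1))
print(agree)
```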


3. Derivation of a “Best” Test Based on the First r out of n Ordered
Observations Drawn from an Exponential Distribution

In this section we study the question of how “best” to use the first
r ordered observations (from a sample of size n) so as to decide between
two values of $\theta$, $\theta_1$ and $\theta_2$ (where $\theta_1 > \theta_2$). By “best” we mean according
to the usual Neyman-Pearson terminology a test which has the property
that among all tests having a fixed probability $\alpha$ (size) of rejecting
$\theta = \theta_1$ when true, the test in question will have the largest possible
chance of rejecting $\theta = \theta_1$ when the alternative $\theta = \theta_2$ is true.

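To derive the “best” test we use the Neyman-Pearson lemma (see,
e.g., [1], pp. 529-531). According to this lemma a “best” test must be
one for which the region of rejection can be found from the inequality

(20)   $f(x_{1,n}, x_{2,n}, \cdots, x_{r,n}; \theta_2)\,/\,f(x_{1,n}, x_{2,n}, \cdots, x_{r,n}; \theta_1) > K.$

From (1) and (3) this becomes

(21)   $\Bigl(\dfrac{\theta_1}{\theta_2}\Bigr)^{r} \exp\Bigl\{-r\,\hat{\theta}_{r,n}\Bigl[\dfrac{1}{\theta_2} - \dfrac{1}{\theta_1}\Bigr]\Bigr\} > K.$

Since $\theta_1$ and $\theta_2$ are preassigned constants such that $(1/\theta_2) - (1/\theta_1) > 0$,
it follows at once that the region of rejection for $\theta = \theta_1$ has the form

(22)   $\hat{\theta}_{r,n} < C.$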


То meet the condition that the probability of rejecting 0 —6; when true 
equals o, we need to choose C so that 
(23) Pr Onn < C| 0 = 0) =a 


To find C explicitly we use Theorem 1, which states that $\hat{\theta}_{r,n}$ has
(2) as its probability density function. From this it is very easy to
verify that $W = 2r\hat{\theta}_{r,n}/\theta$ is a random variable which is distributed as
chi-square with 2r degrees of freedom. Thus (23) can be rewritten as

(24) $\Pr(W < 2rC/\theta_1) = \alpha$, or equivalently as

(25) $\Pr(W > 2rC/\theta_1) = 1 - \alpha.$

Let us denote a chi-square variable with n degrees of freedom as $\chi^2(n)$
and let us define the constant $\chi^2_\gamma(n)$ by the equality

$\Pr(\chi^2(n) > \chi^2_\gamma(n)) = \gamma.$

Thus (25) can be written as

(26) $C = \theta_1 \chi^2_{1-\alpha}(2r)/2r.$


Hence (23) will be satisfied if the region of rejection for $\theta = \theta_1$ is given
by

(22') $\hat{\theta}_{r,n} < \theta_1 \chi^2_{1-\alpha}(2r)/2r.$

According to the Neyman-Pearson lemma the region of rejection (22')
has a greater chance of rejecting $\theta = \theta_1$ when $\theta = \theta_2$ is true than any other
region which assigns probability $\alpha$ to the rejection of $\theta = \theta_1$ (when $\theta_1$
is the true value). Evidently the region (22') does not depend on the
particular choice of alternative $\theta_2$. The region (22') is “best” in the
Neyman-Pearson sense for any $\theta_2 < \theta_1$. Hence (22') gives a uniformly most
powerful test in the Neyman-Pearson sense of the hypothesis $\theta = \theta_1$
against $\theta < \theta_1$.

It is convenient in what follows to use acceptance rather than rejection
regions. Consequently the Neyman-Pearson theory tells us that a
simple test for $\theta = \theta_1$ against $\theta < \theta_1$ with Type I error $= \alpha$ is given by an
acceptance region of the form

(22'') $\hat{\theta}_{r,n} > \theta_1 \chi^2_{1-\alpha}(2r)/2r.$

Let us now look at the operating characteristic curve of a procedure
specified by (22''), i.e., let us study

$L(\theta)$ = Probability of accepting $\theta = \theta_1$ when $\theta$ is the true value

(27) $L(\theta) = \Pr(\hat{\theta}_{r,n} > \theta_1 \chi^2_{1-\alpha}(2r)/2r \mid \theta) = \Pr(\chi^2(2r) > \theta_1 \chi^2_{1-\alpha}(2r)/\theta)$

since $2r\hat{\theta}_{r,n}/\theta$ is distributed as $\chi^2(2r)$ when $\theta$ is the true value. The graph
of $L(\theta)$ for various values of r and of the ratio $\theta_1/\theta$ where $\alpha = .05$ is given
in Figure 1.
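
The constant C and the operating characteristic can be computed without tables, since a chi-square variable with an even number 2r of degrees of freedom has the closed-form tail $e^{-x/2} \sum_{k=0}^{r-1} (x/2)^k/k!$. The sketch below rests on that fact and on our reading of (26) and of the expression for $L(\theta)$; the percentage point is found by bisection, and all function names are ours rather than the paper's.

```python
import math


def chi2_sf_even(x, r):
    """Pr(chi-square with 2r d.f. > x), closed form valid because 2r is even."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, r):
        term *= half / k  # builds (x/2)^k / k! step by step
        total += term
    return math.exp(-half) * total


def chi2_upper_point(gamma, r):
    """Value c with Pr(chi-square(2r) > c) = gamma, located by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, r) > gamma:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, r) > gamma:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def critical_value(theta1, alpha, r):
    """C as in (26): accept theta = theta1 when the estimate exceeds C."""
    return theta1 * chi2_upper_point(1.0 - alpha, r) / (2 * r)


def oc_curve(theta, theta1, alpha, r):
    """L(theta): probability of accepting theta = theta1 when theta is true."""
    c = critical_value(theta1, alpha, r)
    return chi2_sf_even(2 * r * c / theta, r)


if __name__ == "__main__":
    for ratio in (1.0, 1.5, 2.0, 3.0, 5.0):  # ratio = theta1 / theta
        print(ratio, round(oc_curve(1.0 / ratio, 1.0, 0.05, 10), 4))
```

Evaluating `oc_curve` over a grid of ratios $\theta_1/\theta$ for several r gives curves of the kind described for Figure 1.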


Fig. 1. Operating characteristics of tests of the form $\hat{\theta}_{r,n} > C$. Vertical axis: $L(\theta)$ = probability of accepting $\theta = \theta_1$ when $\theta$ is the true value.

In the problem just discussed it was assumed that r and $\alpha$ are known
and C is unknown. We shall now consider a problem where both r and
C are initially unknown. We want to choose these unknowns in such
a way that the resulting operating characteristic curve will have the
property that

(28) $L(\theta_1) = 1 - \alpha$ and $L(\theta_2) \le \beta$,

where $\theta_2 < \theta_1$ and $\alpha$ and $\beta$ are prescribed in advance. To meet condition
(28) means substituting $\theta_2$ for $\theta$ in (27) and requiring that r be such
that

(29) $\theta_1 \chi^2_{1-\alpha}(2r)/\theta_2 \ge \chi^2_\beta(2r)$ or $\theta_2/\theta_1 \le \chi^2_{1-\alpha}(2r)/\chi^2_\beta(2r).$

Knowing (29) makes it an easy matter to find that integer r which ensures
that the operating characteristic curve pass most nearly through
the points $[\theta_1, L(\theta_1) = 1 - \alpha]$ and $[\theta_2, L(\theta_2) = \beta]$. It can be verified that
as r goes through the values 1, 2, 3, ..., the ratio $\chi^2_{1-\alpha}(2r)/\chi^2_\beta(2r)$ is
strictly increasing, and it is easy to show that it tends to unity. Consequently
there is a smallest integer r such that

(30) $\chi^2_{1-\alpha}(2r)/\chi^2_\beta(2r) \ge \theta_2/\theta_1.$
TABLE II

VALUES OF r AND ACCEPTANCE REGIONS FOR FIXED $\alpha$, $\beta$ WHERE
$\alpha$ = PROBABILITY OF REJECTING $\theta_1$ WHEN $\theta = \theta_1$, $\beta$ = PROBABILITY
OF ACCEPTING $\theta_1$ WHEN $\theta = \theta_2$, AND WHERE $\theta_1 > \theta_2$.
ACCEPTANCE REGION IS OF FORM $\hat{\theta}_{r,n} > C$

θ1/θ2 |   r   C/θ1   C*/θ1 |   r   C/θ1   C*/θ1 |   r   C/θ1   C*/θ1

α = .01
      |      β = .01       |      β = .05       |      β = .10
 3/2  | 136  .8114  .8068  | 101  .7831  .7794  |  83  .7625  .7620
 2    |  46  .6892  .6873  |  35  .6492  .6466  |  30  .6247  .6220
 5/2  |  27  .6073  .6005  |  21  .5631  .5536  |  18  .5342  .5246
 3    |  19  .5445  .5365  |  15  .4985  .4864  |  13  .4692  .4559
 4    |  12  .4523  .4477  |  10  .4130  .3926  |   9  .3897  .3610
 5    |   9  .3897  .3807  |   8  .3633  .3287  |   7  .3329  .3009
 10   |   5  .2558  .2321  |   4  .2058  .1938  |   4  .2058  .1670

α = .05
      |      β = .01       |      β = .05       |      β = .10
 3/2  |  95  .8374  .8360  |  67  .8079  .8059  |  55  .7890  .7841
 2    |  33  .7319  .7244  |  23  .6834  .6830  |  19  .6548  .6515
 5/2  |  19  .6548  .6488  |  14  .6046  .5905  |  11  .5608  .5602
 3    |  13  .5915  .5852  |  10  .5426  .5235  |   8  .4976  .4905
 4    |   9  .5217  .4834  |   7  .4694  .4230  |   6  .4355  .3804
 5    |   7  .4694  .4163  |   5  .3940  .3661  |   4  .3416  .3341
 10   |   4  .3416  .2511  |   3  .2725  .2099  |   3  .2725  .1774

α = .10
      |      β = .01       |      β = .05       |      β = .10
 3/2  |  77  .8570  .8560  |  52  .8269  .8239  |  41  .8058  .8031
 2    |  26  .7583  .7559  |  18  .7123  .7084  |  15  .6866  .6710
 5/2  |  15  .6866  .6785  |  11  .6383  .6168  |   9  .6036  .5775
 3    |  11  .6383  .6104  |   8  .5820  .5478  |   6  .5253  .5153
 4    |   7  .5564  .5204  |   5  .4865  .4577  |   4  .4363  .4176
 5    |   5  .4865  .4642  |   4  .4363  .3877  |   3  .3673  .3548
 10   |   3  .3673  .2802  |   2  .2660  .2372  |   2  .2660  .1945

For the acceptance region $\hat{\theta}_{r,n} > C^*$, $L(\theta_1) \ge 1 - \alpha$ and $L(\theta_2) = \beta$.
For the acceptance region $\hat{\theta}_{r,n} > C$, $L(\theta_1) = 1 - \alpha$ and $L(\theta_2) \le \beta$.
For any $C'$ such that $C^* < C' < C$, the acceptance region $\hat{\theta}_{r,n} > C'$ has $L(\theta_1) > 1 - \alpha$ and $L(\theta_2) < \beta$.


This is the value of r which we wish to use. If, with this value of r, we
use an acceptance region for $\theta = \theta_1$ of the form

(31) $\hat{\theta}_{r,n} > C$ where $C = \theta_1 \chi^2_{1-\alpha}(2r)/2r$

we shall have a test whose operating characteristic curve is such that
$L(\theta_1) = 1 - \alpha$ and $L(\theta_2) \le \beta$. [Incidentally, a region of acceptance for
$\theta = \theta_1$ of the form $\hat{\theta}_{r,n} > C^*$ where $C^* = \theta_2 \chi^2_\beta(2r)/2r$ will give for the same
r an operating curve such that $L(\theta_1) \ge 1 - \alpha$ and $L(\theta_2) = \beta$.]
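
The choice of r by (30), together with the constants C and C*, can be sketched as follows. The routine is an illustration under our reading of (30) and (31): it searches r = 1, 2, 3, ... and returns the first r for which the ratio of chi-square percentage points reaches $\theta_2/\theta_1$. The chi-square tail uses the closed form available for even degrees of freedom; the names `find_plan` and `r_max` are ours.

```python
import math


def chi2_sf_even(x, r):
    """Pr(chi-square with 2r d.f. > x) for even degrees of freedom 2r."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, r):
        term *= half / k
        total += term
    return math.exp(-half) * total


def chi2_upper_point(gamma, r):
    """Value c with Pr(chi-square(2r) > c) = gamma, located by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, r) > gamma:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, r) > gamma:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def find_plan(alpha, beta, ratio, r_max=500):
    """Return (r, C/theta1, C*/theta1) for ratio = theta1/theta2.

    r is the smallest integer satisfying our reading of (30);
    r_max is an arbitrary search limit chosen for this sketch.
    """
    for r in range(1, r_max + 1):
        lower = chi2_upper_point(1.0 - alpha, r)  # chi^2_{1-alpha}(2r)
        upper = chi2_upper_point(beta, r)         # chi^2_{beta}(2r)
        if lower / upper >= 1.0 / ratio:
            return r, lower / (2 * r), upper / (2 * r * ratio)
    raise ValueError("no r up to r_max satisfies (30)")


if __name__ == "__main__":
    print(find_plan(0.05, 0.10, 10.0))
```

For $\alpha = .05$, $\beta = .10$ and $\theta_1/\theta_2 = 10$ the sketch returns r = 3 with C/θ1 and C*/θ1 agreeing with the corresponding entry of the table to the figures shown there.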

In summary, we have shown that given $\alpha$, $\beta$ and the ratio $\theta_1/\theta_2$ it is
possible to find an r and C and a region of acceptance for $\theta = \theta_1$ of the
form $\hat{\theta}_{r,n} > C$ such that $L(\theta_1) = 1 - \alpha$ and $L(\theta_2) \le \beta$. The computations
for r < 100 were made using a table of $\chi^2$, particularly as extended by
Catherine Thompson [8], and a Bureau of Standards compilation [2].
For r ≥ 100, the Fisher form of the normal approximation to $\chi^2$ with 2r
degrees of freedom was used in computing $\chi^2_{1-\alpha}(2r)$ and $\chi^2_\beta(2r)$. For
certain selected values of $\alpha$ and $\beta$ the computations can be further simplified
by using [3] and [4]. In Table 2 we give test procedures (i.e.,
regions of acceptance) for various selected values of $\alpha$ and $\beta$ and the
ratio $\theta_1/\theta_2$. It is evident from this table how we obtained the numerical
values for the special example given in the expository part of this paper.

Remark: One can, in a completely analogous way, find a uniformly
most powerful test in the Neyman-Pearson sense for testing $\theta = \theta_1$
against the one-sided class of alternatives $\theta > \theta_1$. In this case the region
of acceptance for $\theta = \theta_1$ is of the form $\hat{\theta}_{r,n} < K$ where K is such that
$\Pr(\hat{\theta}_{r,n} < K \mid \theta_1 \text{ true}) = 1 - \alpha$.


4. Truncated Tests 


Mathematical details for truncated and sequential tests will be given 
in other publications. 
The problem discussed is that of comparing the longevities
of two or more types of equipment under operational conditions
where it is not convenient to identify or keep records of
individual items. Such a comparison can be made by adopting
certain replacement policies and observing their effect on the
composition of the population. Methods of estimating relative
and absolute longevities are given for the case where k types
of equipment are being compared and various logistical requirements
are placed upon the replacement policies. Methods
of making decisions and testing hypotheses concerning the relative
and absolute longevities are also given. Replacement
policies are given which, under certain conditions, are optimum
for purposes of studying longevity.


1. INTRODUCTION AND SUMMARY 
1.1. Introduction 


A comparison of the longevities of two or more types of equipment
under operational conditions where it is not convenient to identify 
or keep records of individual items can be made by adopting a certain 
replacement policy and observing its effect on the composition of the 
population. When only two types are being compared, for example, the 
policy might be that when an item fails it will be replaced by one of the 
opposite type. Then the composition of the population at any time 
(that is, the proportions of the different types among all the items in 
use) will depend upon the original composition of the population, the 
time elapsed, and the longevities of the different types. Since the orig- 
inal composition and the elapsed time are known, by determining the 
new composition of the population we can obtain information concern- 
ing the longevities of the different types of equipment. 


1.2. Summary 


The problem of comparing two types of equipment is analyzed in 
detail when it is assumed that the equipment is subject to a constant 
risk. Later we see that if a replacement policy is used for a long period 
of time, the results obtained under the assumption of a constant risk 
remain valid even when the risk is not constant. A type of replacement 
policy is investigated for which approximately equal numbers of units 
of each type of equipment are used as replacements. An example of
such a replacement policy is: Half of the replacements will be of one 
type of equipment, and half of the other. This “50-50” policy is de- 
scribed by J. L. Glathart and Е. W. Preston [4] and attributed to 
Merrill M. Flood. Another replacement policy of this type is: When an 
item fails, replace it by one of the opposite type (“switch”). We shall
see that this “switch” policy has certain advantages over all other
policies of the type investigated. Under certain conditions, for example,
the estimates of the relative longevities when using a switch policy are
less biased and have a smaller mean square error than the estimates
obtained under other policies of the kind investigated. Correspondingly,
the power of tests of various hypotheses concerning the relative
longevities is greater when the switch policy is used.

Types of replacement policies which satisfy different logistical requirements
are also investigated. For example, it might be necessary
to use as replacements only half as many items of one type as of the 
other. Methods of estimating and testing hypotheses concerning the 
relative longevities are given which may be adopted when these re- 
placement policies are in use. 

If the replacement policies have been used for a long time, only 
estimates of population composition are needed for estimating or for 
testing hypotheses concerning relative longevities. If information 
about the stock is also available (i.e., knowledge of the total number 
of items that have been replaced), we can estimate the absolute longevi- 
ties of the individual types of equipments and correspondingly, hy- 
potheses concerning the absolute longevities can be tested. A numerical 
illustration is presented. 

Methods of estimating relative and absolute longevities are given 
also for the case where we wish to compare k types of equipment and 
where various logistical requirements are placed upon the replacement 
rules. 

The work presented herein may be considered a special application 
of renewal theory and the theory of Markov chains. An excellent exposi- 
tion of these theories is given by Feller in [8], Chapters 12, 13, and 15.

2. COMPARING TWO TYPES OF EQUIPMENT
2.1. A Symmetric Type of Replacement Policy

2.1.1. Definition of the symmetric replacement policy. Consider the
following type of replacement policy: When an item fails, the probability
is p (assumed to be greater than zero) that its replacement will be
of the opposite type. That is, about 100p% of the replacements will be
of the type different from the item that failed and about 100(1−p)% of
the replacements will be of the same type as the failure. For example,
when p = 1 we have a “switch” policy, and when p = 1/2 we have a
“50-50” policy. We shall first consider the case where inspections are
made at periodic intervals, at which time the items found to have
failed are replaced. In Section 2.1.12 the case of instantaneous replacement
upon failure is considered.

2.1.2. Population and sample composition. Let $f_i$ be the probability
(assumed to be greater than zero) that an item of type i (i = 1 or 2),
which had not failed at time x (say, the third week), will have failed
by the next time x+1 it is inspected (say, the fourth week). We write
$1 - f_i = s_i$ for the probability that an item of type i will survive the
entire period between inspections. Then the probability is $s_i^{x-1} f_i$ that
an item of type i will be found on the xth inspection to have failed
since the (x−1)th inspection. The mean length of life for items of
type i is

(1) $\sum_{x=1}^{\infty} x\, s_i^{x-1} f_i = 1/f_i = L_i.$

That is, the length of life of items of type i has a negative binomial
distribution where the longevity is equal to $1/f_i = L_i$. If a replacement
policy is used for a long period of time, the results obtained when the
length of life has a negative binomial distribution with mean $L_i$ will
remain valid also in cases where the distribution of length of life has
mean $L_i$ but is not negative binomial (see Section 5).

Consider the effect of adopting the symmetric replacement rule defined
in Section 2.1.1 upon the population composition. For simplicity,
we consider the case where the population was initially composed of N
items of each type of equipment. If an item is drawn at random from
the 2N items of which the population is composed at the xth inspection,
the probability that it is of type 1 is

(2) $\Pr_x(1 \mid p) = L_1/(L_1 + L_2) + \frac{1}{2}\left[\frac{L_2 - L_1}{L_1 + L_2}\right][1 - p(f_1 + f_2)]^x.$

If 2n items are drawn at random from the 2N items of which the population
is composed at the xth inspection, then the number $n_1$ of items
of type 1 has a binomial distribution with parameters 2n and $\Pr_x(1 \mid p)$.
That is, the chance that $n_1$ items in the sample will be of type 1 is
$C^{2n}_{n_1} [\Pr_x(1 \mid p)]^{n_1} [\Pr_x(2 \mid p)]^{n_2}$, where $C^{y}_{z} = y!/[z!(y-z)!]$. Hence the
expected value of the proportion $n_1/2n$ of items of type 1 among the
sample of 2n items from the population composition at the xth inspection is

(3) $E\{n_1/2n \mid p\}_x = \Pr_x(1 \mid p).$

The number of items of type 1 among the 2N items of which the population
is composed at the xth inspection has a binomial distribution
with parameters 2N and $\Pr_x(1 \mid p)$. (See Appendix A1 for proofs of
statements presented in this section.)
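
As an illustration of how the composition evolves, the sketch below iterates a one-step recursion and compares it with a closed form. Both are our reading of the text, not a transcription of Appendix A1: the recursion $P_x = P_{x-1}[1 - p(f_1+f_2)] + p f_2$ follows from the replacement rule, and the closed form is $L_1/(L_1+L_2) + \frac{1}{2}[(L_2-L_1)/(L_1+L_2)][1-p(f_1+f_2)]^x$ with $L_i = 1/f_i$. The function names and the numerical values are hypothetical.

```python
def composition_recursive(x, p, f1, f2, start=0.5):
    """Probability of type 1 after x inspections, by iterating the recursion.

    An item is of type 1 at the next inspection if it was of type 1 and was not
    switched away (probability 1 - p*f1), or was of type 2, failed, and was
    replaced by the opposite type (probability p*f2).
    """
    prob = start
    for _ in range(x):
        prob = prob * (1.0 - p * (f1 + f2)) + p * f2
    return prob


def composition_closed(x, p, f1, f2):
    """Closed form, assuming equal initial numbers of the two types."""
    l1, l2 = 1.0 / f1, 1.0 / f2
    limit = l1 / (l1 + l2)
    return limit + 0.5 * (l2 - l1) / (l1 + l2) * (1.0 - p * (f1 + f2)) ** x


if __name__ == "__main__":
    f1, f2 = 0.10, 0.05  # hypothetical failure chances per inspection period
    for p in (0.5, 1.0):
        for x in (1, 5, 20, 100):
            print(p, x, composition_recursive(x, p, f1, f2), composition_closed(x, p, f1, f2))
```

With these values the limiting proportion of type 1 is $L_1/(L_1+L_2) = 1/3$, and the approach to the limit is faster for p = 1 than for p = 1/2.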

2.1.3. Estimating relative longevities. By simply observing the composition
of a sample of 2n items which are in use at the xth inspection
we can obtain an estimate of the relative longevities of the types of
equipment. That is, the proportion $n_1/2n$ of items of type 1 is an estimate
of the relative longevity $L_1/(L_1+L_2)$. The bias of the estimate
$n_1/2n$ of the relative longevity is

(4) $B(p) = E\{n_1/2n \mid p\}_x - L_1/(L_1 + L_2) = \frac{1}{2}\left[\frac{L_2 - L_1}{L_1 + L_2}\right][1 - p(f_1 + f_2)]^x,$

which becomes negligible when the replacement policy is in use for
a long period of time. We shall deal with the case where $f_1 + f_2 \le 1$. In
this case the absolute bias $|B(p)|$ is a monotonically decreasing function
of p, and, therefore, is minimized when p = 1. That is, among all
replacement policies of the symmetric type investigated, the switch
policy leads to estimates which are least biased. Also, the bias approaches
zero most rapidly when the estimate of the relative longevity
is made from data resulting from the switch policy.

The mean square error of the estimate $n_1/2n$ is

(5) $E\{[n_1/2n - L_1/(L_1 + L_2)]^2 \mid p\}_x = \Pr_x(1 \mid p)\Pr_x(2 \mid p)/2n + B^2(p).$

Among all replacement policies of the type investigated, the switch
policy leads to estimates which have the smallest mean square error
(see Appendix A2). Also, the mean square error approaches
$L_1 L_2/[(L_1+L_2)^2\, 2n]$ most rapidly when the estimate of relative longevity
is made from data resulting from the switch policy.

Since the mean square error is approximately

$L_1 L_2/[(L_1+L_2)^2\, 2n]$

when the replacement policy has been used for a long period of time,
the mean square error can also be estimated from the sample composition.
An estimate of the mean square error is

$\frac{n_1}{2n} \cdot \frac{n_2}{2n} \Big/ 2n = n_1 n_2/8n^3.$

An estimate of the square root of the mean square error is

(6) $\frac{1}{2n}\sqrt{n_1 n_2/2n}.$

2.1.4. Making decisions and testing hypotheses concerning relative
longevities. Consider the problem of choosing the type of equipment
which has the greater longevity. A reasonable symmetric procedure
would be to choose type 1 if $n_2 \le d$ and to choose type 2 if $n_1 \le d$, where
$d < n$ is a constant to be determined by the amount of indecision we
are willing to permit. (When d = n − 1 there will be indecision only when
the number of type 1 items exactly equals the number of type 2 items
in the sample of 2n items from the population composition at the xth
inspection.) The probability of making an incorrect decision, say, of
choosing type 2 when in fact type 1 has the greater longevity is

(7) $P = \Pr_x(n_1 \le d) = \sum_{n_1=0}^{d} C^{2n}_{n_1} [\Pr_x(1 \mid p)]^{n_1} [\Pr_x(2 \mid p)]^{n_2},$

where $\Pr_x(1 \mid p) > 1/2$. (If $L_1 > L_2$, then $\Pr_x(1 \mid p) > 1/2$ when x > 0.) A proof
of the fact that P is a decreasing function of $\Pr_x(1 \mid p)$ is given in Appendix
A3. Also $\Pr_x(1 \mid p)$ is an increasing function of p (see Appendix
A1). Whence, P is a decreasing function of p and is minimized when
p = 1. That is, among all replacement policies of the type investigated,
the switch policy minimizes the probability of making an incorrect
decision as to which type of equipment has the greater longevity. The
switch policy also maximizes the probability of making a correct choice
of equipment. Also, P approaches

(8) $\sum_{n_1=0}^{d} C^{2n}_{n_1} L_1^{n_1} L_2^{n_2}/(L_1 + L_2)^{2n}$

most rapidly when the switch policy is used.

The switch policy also has similar optimum properties in the case
where, say, the hypothesis H that $L_1 = L_2$ is to be tested at a level of
significance $\alpha$ against the alternative that $L_1 > L_2$. Then d is the smallest
integer such that

$\sum_{n_1=0}^{d} C^{2n}_{n_1} 2^{-2n} \ge 1 - \alpha,$

and H is accepted when $n_1 \le d$ and rejected otherwise.
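
The constant d of the test and the error probability P can be obtained directly from binomial sums. The sketch below assumes our reading of the rule, namely that d is the smallest integer whose cumulative probability under $L_1 = L_2$ is at least $1 - \alpha$, and that P is the chance that $n_1 \le d$; the function names are ours.

```python
from math import comb


def critical_d(n, alpha):
    """Smallest d with sum_{k=0}^{d} C(2n, k) * 2^(-2n) >= 1 - alpha."""
    size = 2 * n
    cumulative = 0
    for d in range(size + 1):
        cumulative += comb(size, d)
        if cumulative >= (1.0 - alpha) * 2 ** size:
            return d
    return size


def error_probability(d, n, prob_type1):
    """Pr(n_1 <= d) when each sampled item is of type 1 with chance prob_type1."""
    size = 2 * n
    return sum(
        comb(size, k) * prob_type1 ** k * (1.0 - prob_type1) ** (size - k)
        for k in range(d + 1)
    )


if __name__ == "__main__":
    n = 5  # sample of 2n = 10 items
    d = critical_d(n, 0.05)
    print("d =", d)
    # Chance of accepting L1 = L2 when type 1 items make up 70% of the population
    print(error_probability(d, n, 0.7))
```

For a sample of 2n = 10 items and $\alpha = .05$ the sketch gives d = 8, so that H is rejected only when 9 or 10 of the sampled items are of type 1.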
When 2n is large, then

(9) $[n_1 - 2n\Pr_x(1 \mid p)]/\sqrt{2n\Pr_x(1 \mid p)\Pr_x(2 \mid p)} = y$

is approximately normally distributed with zero mean and unit variance,
and, hence

(10) $P = \Pr_x(n_1 \le d) \approx \Pr\{y \le [d - 2n\Pr_x(1 \mid p)]/\sqrt{2n\Pr_x(1 \mid p)\Pr_x(2 \mid p)}\}.$

This formula can be used by the experimenter to determine the sample
size 2n and the number x of inspections which will be necessary in
order to guarantee that the chance of an incorrect choice of equipment
(or an error of the second kind) will be a preassigned amount P.
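
To see how close the normal approximation is for a given sample size, the exact binomial sum and the approximation can be compared side by side. The sketch below is illustrative only; it applies no continuity correction, in keeping with the formula as we read it, and the function names are ours.

```python
from math import comb, sqrt
from statistics import NormalDist


def exact_error(d, n, prob):
    """Exact Pr(n_1 <= d) for a binomial sample of 2n items."""
    size = 2 * n
    return sum(comb(size, k) * prob ** k * (1.0 - prob) ** (size - k) for k in range(d + 1))


def approx_error(d, n, prob):
    """Normal approximation to Pr(n_1 <= d), without continuity correction."""
    size = 2 * n
    z = (d - size * prob) / sqrt(size * prob * (1.0 - prob))
    return NormalDist().cdf(z)


if __name__ == "__main__":
    for n, d, prob in ((10, 8, 0.6), (50, 45, 0.6), (200, 190, 0.6)):
        print(2 * n, d, exact_error(d, n, prob), approx_error(d, n, prob))
```

Scanning n (and, through $\Pr_x(1 \mid p)$, the number x of inspections) until the approximate P falls below the preassigned amount gives a first estimate of the required sample size, which can then be checked against the exact sum.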

The test discussed herein for the hypothesis that $L_1 = L_2$ against
the alternative that $L_1 > L_2$ may be considered a test of the hypothesis
that the proportion of type 1 items among the 2N items of which the
population is composed at the xth inspection is 1/2 against the alternate
hypothesis that this proportion is greater than 1/2. In other words, the
test discussed herein is essentially a test of the mean of a binomial
distribution. Sequential methods have been developed for testing the
mean of a binomial distribution (see [16], Chapter 5). A detailed and
nonmathematical discussion of this problem, together with a number
of tables, charts, and computational simplifications, is contained in
[5]. If the population is composed of many items (2N is large) and the
replacement policy has been in use for a long period of time (x is large),
the sequential method can be directly used to test hypotheses concerning
the relative longevities. For example, if the null hypothesis is that
$L_1/(L_1+L_2) = .1$ and the alternate hypothesis is that $L_1/(L_1+L_2) = .3$,
the acceptance and rejection numbers which describe the sequential
test are given in Table 5 of [6], p. 93 when the desired probabilities of
making errors of type I and II are $\alpha = .02$ and $\beta = .03$ respectively
(see [6], p. 90). The number of items of type 1 observed (number of
defects observed) is recorded as in Table 5 of [6] (or graphed as in
Fig. 11 of [6], p. 94) until the procedure leads to acceptance or rejection
of the null hypothesis.

2.1.5. Estimating relative longevities from several inspections. In the
preceding sections the only information used in estimating relative
longevities was obtained by studying the composition of a sample of
2n items from the population of 2N items on the xth inspection. In this
section, information obtained by studying the composition of samples
of 2n items from the population on the xth, 2xth, 3xth, ..., kxth inspection
will be used to estimate the relative longevities. A simple
estimate of the relative longevity of type 1 items would be the proportion
$n_1(k)/2nk$ of type 1 items among the 2nk items observed in the
k samples; that is, $n_1(k) = \sum_{i=1}^{k} n_{1i}$, where $n_{1i}$ is the number of type 1
items observed at the ixth inspection.

The bias B(p) of this estimate is minimized when the switch policy
is used. When x is large (and/or when n/N is small) we can assume that
the dependence between the k samples is negligible and the mean square
error of the estimate can be determined approximately from the usual
formula for the mean square error of a sum. Whence, the mean square
error of the estimate is about $\sum_{i=1}^{k} \Pr_{ix}(1 \mid p)\Pr_{ix}(2 \mid p)/2nk^2 + B^2(p)$.
When the switch policy is used the mean square error approaches
$L_1 L_2/[(L_1+L_2)^2\, 2nk]$ more rapidly than under other policies of the type
investigated.

2.1.6. An advantage of the symmetric type of replacement policy. One
of the advantages of replacement policies of the type investigated is
that they have the effect of keeping in use on the average more items
of the type with the greater longevity. That is, when $L_1/(L_1+L_2) > 1/2$,
the average number of items of type 1 in use at any inspection time
x is

(11) $2N\Pr_x(1 \mid p) > N.$

This average number approaches $2NL_1/(L_1+L_2)$ when the replacement
rule is in use for a long period of time. Since $2N\Pr_x(1 \mid p)$ is a
monotone function of p, the switch policy has the effect of keeping in
use on the average more of the type with the greater longevity than
the other policies of the type investigated. This effect occurs immediately
(even at the first inspection) and the effect continues to increase
with the length of time the replacement rule is in use. Hence, even if
the sample composition is not observed in order to choose which type
has the greater longevity, the use of the replacement policy insures us
that more of the type with the greater longevity will be in use on the
average.

Aside from logistical considerations, it would seem more desirable to
have a replacement policy which would lead asymptotically to exclusive
usage of the type with the greater longevity. Replacement
policies having this desirable property could be devised. However,
these policies make reference to past results (information concerning
failures at the preceding inspections). In Appendix A4 it is proved
that if only “simple” replacement policies which are easy to apply
under operational conditions are considered, then it is impossible to
devise a policy which would lead asymptotically to exclusive usage of
the type with the greater longevity. By a “simple” replacement policy
we mean a policy which determines the number of replacements
of each type using at most the following two facts: (1) the population
composition after the last inspection, and (2) the number of failures of
each type since the last inspection.

2.1.7. A logistical consideration. Suppose logistical conditions are
such that about the same number of each type of equipment is available.
It would then be desirable to adopt replacement policies which
use about the same number of each type of equipment as replacements.
If a symmetric type of replacement policy is applied, about the same
number of each type of equipment will be used as replacements (see
Appendix A5).

We shall see in Section 4.2 that the replacement policies of the 
symmetric type are the only replacement policies among a more general 
type of policy which have this property of equal usage of replacements. 
When the switch policy is adopted, fewer replacements will be needed 
on the average (see Appendix A5). 

2.1.8. Estimating absolute longevities. In the preceding sections the
only information used in estimating and testing hypotheses was obtained
by observing the composition of a sample (or samples) of items.
In this section we consider the case where information concerning stock
(that is, the number of items that have been replaced since the first
inspection) is also available. Using this added knowledge, the absolute
longevities of the types of equipment can be estimated.

Applying the results of Appendix A5, the average of the number $R_x$
of replacements used at the xth inspection is $2N\Pr_x(r \mid p)$ (see equation
(57)) and the average of the number $\sum_{i=1}^{x} R_i = U_x$ of replacements
used since the first inspection is


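(12) $E\{U_x \mid p\} = \sum_{i=1}^{x} E\{R_i \mid p\} = 2N\sum_{i=1}^{x} \Pr_i(r \mid p) = 4Nx/(L_1 + L_2) + \frac{N}{p}\{1 - [1 - p(f_1 + f_2)]^x\}\left(\frac{L_2 - L_1}{L_1 + L_2}\right)^2.$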

Hence, $U_x/4Nx$ is a biased estimate of $1/(L_1+L_2)$, but the bias
approaches zero when the replacement policy is used for a long period
of time. The bias approaches zero most rapidly for the switch policy.
Since the variance of $U_x/4Nx$ will approach zero as x becomes large,
$U_x/4Nx$ converges in probability to $1/(L_1+L_2)$ and $4Nx/U_x$ converges
in probability to $L_1+L_2$ (see [2], p. 255).

If both the sample composition and information concerning stock are
available, the absolute longevity, say, $L_1$ can be estimated using

(13) $2Nxn_1/U_x n.$

From Sections 2.1.2 and 2.1.3 we see that the limiting distribution of
$n_1$ is a binomial distribution with parameters 2n and $L_1/(L_1+L_2)$ when
x approaches infinity. Hence the expected value of the limiting distribution
of $2Nxn_1/U_x n$ is $(L_1+L_2)[L_1/(L_1+L_2)] = L_1$; that is, the
bias of the limiting distribution of the estimate is nil. The square root
of the mean square error of this estimate of absolute longevity can be
estimated by

(14) $\left(\frac{2Nx}{U_x n}\right)\sqrt{n_1 n_2/2n}.$


Using Slutsky's theorem (see [2], p. 255) the estimate is found to be 
consistent when n is also large. 
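
The estimates of this section involve only a few counts, and the arithmetic can be sketched as follows. The standard-error routine reflects our own reading: it scales the binomial standard error of $n_1$ by the factor $2Nx/U_x n$ and treats $U_x$ as fixed, which is an assumption rather than a result quoted from the paper. The function names and the sample figures are hypothetical.

```python
from math import sqrt


def estimate_sum_of_longevities(N, x, U):
    """4Nx/U_x, the stock-based estimate of L1 + L2."""
    return 4.0 * N * x / U


def estimate_l1(N, x, U, n, n1):
    """2*N*x*n1 / (U_x * n), the estimate of the absolute longevity L1."""
    return 2.0 * N * x * n1 / (U * n)


def estimate_rmse_l1(N, x, U, n, n1):
    """Plug-in standard error of the L1 estimate (assumes U_x is fixed)."""
    n2 = 2 * n - n1
    return (2.0 * N * x / (U * n)) * sqrt(n1 * n2 / (2.0 * n))


if __name__ == "__main__":
    # Hypothetical figures: 2N = 200 items in use, x = 50 inspections,
    # U_x = 2000 replacements issued, sample of 2n = 40 with n1 = 24 of type 1.
    N, x, U, n, n1 = 100, 50, 2000, 20, 24
    print(estimate_sum_of_longevities(N, x, U))
    print(estimate_l1(N, x, U, n, n1))
    print(estimate_rmse_l1(N, x, U, n, n1))
```

The estimate of $L_1$ is simply the estimate of $L_1 + L_2$ multiplied by the sample proportion $n_1/2n$, which is why both kinds of information are needed.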

It should be pointed out that if information concerning stock is
available and no logistical restrictions are imposed, replacement policies
other than those investigated in the preceding sections might be better.
For example, consider the replacement policy that an item which
fails will be replaced by the same type. Clearly, more items of the
type with the shorter longevity will be used on the average, contrary
to the logistical consideration pointed out for the symmetric policies in
Section 2.1.7. Hence the logistics must be such as to make it possible
to supply any proportion of each type. Considering the history of a
single item of type 1 in the initial population, we see that the chance
that a replacement will be needed at the xth inspection is $f_1$. The
average of the number V(1) of replacements of type 1 used since the
first inspection is $Nxf_1$. Hence, V(1)/Nx is an unbiased estimate of $f_1$.
Also, since the variance of V(1)/Nx approaches zero as the replacement
rule is used for a long period of time, V(1)/Nx converges in probability
to $f_1$ and Nx/V(1) converges in probability to $L_1$. Hence, when each
item which fails is replaced by its own type, consistent estimates of
longevities are obtained by using only the information concerning
stock. If information concerning stock is not available, then it is obvious
that this replacement policy is useless when relative longevities
are to be calculated from population or sample composition because
the population composition remains the same.

2.1.9. Comparison of switch and 50-50 policies. The 50-50 policy
has the advantage that it is not necessary to observe the type of an item
which has failed in order to replace it, since the chance that a replacement
will be of type 1 is the same whether the failing item is of type 1
or of type 2. We shall now make some direct comparisons between the
50-50 policy and the switch policy.

In the preceding sections the switch policy was found to have various
optimum properties. It should be pointed out, however, that although
the switch policy is “best” in various senses, it is only “slightly
better” than the other replacement policies of the type investigated.
For example, the difference in bias B(p) of the estimate $n_1/2n$ when
the 50-50 policy is adopted rather than the switch policy is

(15) $B(1/2) - B(1) = \frac{1}{2}\left[\frac{L_2 - L_1}{L_1 + L_2}\right]\{[1 - (f_1 + f_2)/2]^x - [1 - (f_1 + f_2)]^x\}.$

This difference is small when the policies are used for a long period of
time. The ratio B(1/2)/B(1), however, approaches infinity when the
policies are used for a long period of time.

Although the switch policy is only “slightly better” in various senses
than the 50-50 policy, to reduce the bias of the estimates of the relative
longevities to any specified amount takes at least twice as many inspections
under the 50-50 policy as under the switch policy (see Appendix
A6). Also to reduce the mean square error of the estimates to any specified
amount takes at least twice as many inspections under the 50-50
policy as under the switch policy. However, the difference between the
mean square errors of the estimates is small when the policies are used
for a long period of time.
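
The comparison of the two policies can be made concrete by counting the inspections needed to bring the absolute bias below a stated tolerance. The sketch below assumes the bias has the form $\frac{1}{2}[(L_2-L_1)/(L_1+L_2)][1-p(f_1+f_2)]^x$ with $L_i = 1/f_i$, which is our reading of the bias formula of Section 2.1.3; the function names, tolerance and failure chances are hypothetical.

```python
def bias(x, p, f1, f2):
    """Bias of n1/2n after x inspections under a symmetric policy with parameter p."""
    l1, l2 = 1.0 / f1, 1.0 / f2
    return 0.5 * (l2 - l1) / (l1 + l2) * (1.0 - p * (f1 + f2)) ** x


def inspections_needed(tolerance, p, f1, f2, x_max=100000):
    """Smallest x for which the absolute bias does not exceed the tolerance."""
    for x in range(x_max + 1):
        if abs(bias(x, p, f1, f2)) <= tolerance:
            return x
    raise ValueError("tolerance not reached within x_max inspections")


if __name__ == "__main__":
    f1, f2 = 0.10, 0.05  # hypothetical failure chances per inspection period
    x_switch = inspections_needed(0.01, 1.0, f1, f2)
    x_fifty = inspections_needed(0.01, 0.5, f1, f2)
    print(x_switch, x_fifty, x_fifty / x_switch)
```

Because the counts are whole numbers of inspections, the ratio printed is close to, and here slightly above, two.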

2.1.10. Length of the inspection interval. In the preceding sections it
has been assumed that the chance is $f_i$ that an item of type i will fail
in the time interval between two successive inspections if it had not
failed at the earlier inspection. A small change d in the time interval
between successive inspections might have the following effect: The
probability is $f_i(1+d)$ that an item of type i which had not failed by
the xth inspection will fail by the (x+1)th inspection, when the time
interval between inspections has been changed by a small amount d.
In that case the bias B(p) of using $n_1/2n$ as an estimate of $L_1/(L_1+L_2)$ is

(16) $B(p) = \frac{1}{2}\left[\frac{L_2 - L_1}{L_1 + L_2}\right][1 - p(f_1 + f_2)(1 + d)]^x.$

When $p(f_1+f_2)(1+d) < 1$, the bias is a decreasing function of d and
hence will be minimized when d is maximized. Therefore, when x inspections
are to be made, the bias and the mean square error will be minimized,
and the power of the tests will be maximized when the time
between inspections is the largest interval which will change the probability
of a failure by a multiplicative factor. This, of course, will increase
the total time which will elapse before estimates are made.

514 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1953

2.1.11. Length of the replacement interval. Suppose the replacement
policies are modified so that replacements are to be made if necessary
only on inspections x, 2x, 3x, .... That is, even if items fail immediately
after inspection jx, they will not be replaced until inspection
(j+1)x. When x > 1, this method has the advantage of requiring fewer
replacement inspections in a given time period. It has the disadvantage
of permitting the total number of items in operation to vary
somewhat. If the replacement policies are used for a long period of
time, the bias and mean square error of the estimates of relative
longevities will be minimized and the power of tests concerning relative
longevities will be maximized when items which have failed are replaced
at the next inspection; that is, when x = 1 (see Appendix A7).

2.1.12. Continuous inspection and replacement. In the preceding
sections we considered the case where inspections are made at periodic
intervals, at which time the items which failed are replaced. Most of
the results obtained in the preceding sections for the case of periodic
inspections will hold even in the case of instantaneous replacement
upon failure, since the basic difference equation obtained in the
former case (see Appendix A1) is analogous to the differential equation
obtained in the latter case (see Appendix A8).


2.2. A General Type of Replacement Policy


2.2.1. Definition of the type of replacement policy. Consider the
following type of replacement policy. When a type 1 item fails, the
chance is $p$ that its replacement will be of type 2, and when a type 2
item fails, the chance is $Mp$ that its replacement will be of type 1.
Without any essential loss of generality, we can assume that $M \ge 1$
and $0 < p \le 1/M$. When $M = 1$ we have the symmetric type of
replacement policy investigated in Section 2.1.

By reasoning similar to that in Appendix A1, the probability that a
descendant of an item in the initial population will be of type 1 at the
$x$th inspection may be represented by

$$\begin{aligned}
\mathrm{Pr}_x(1 \mid p) &= \mathrm{Pr}_{x-1}(1 \mid p)\,(1 - f_1 p - f_2 M p) + f_2 M p \\
&= \frac{L_1 M}{L_1 M + L_2} + \frac{1}{2}\left[\frac{L_2 - L_1 M}{L_1 M + L_2}\right]\left[1 - p(f_1 + f_2 M)\right]^{x}.
\end{aligned} \tag{17}$$

If $f_1 + f_2 M \le M$, the absolute value of the second term in the sum
is minimized when $p$ attains its maximum value $1/M$. Hence, the
“switch (M)” policy (that is, when a type 2 item fails, it will be
replaced by an item of type 1, and when a type 1 item fails, the chance
is $1/M$ that its replacement will be of type 2) will have some optimum
properties similar to those proved for the switch policy.

2.2.2. Estimating and testing hypotheses concerning relative longevities.
Suppose a sample of $2n$ items is drawn at random from the $2N$ items
of which the population is composed at the $x$th inspection. Writing
$n_i$ as the number of items of type $i$ observed in the sample, the
expected value of $n_1/2nM$ is $\mathrm{Pr}_x(1 \mid p)/M$. If the
replacement policy has been in use for a long period of time, the bias
of $n_1/2nM$ as an estimate of $L_1/(L_1 M + L_2)$ approaches zero most
rapidly when the switch (M) policy is adopted. Also, the mean square
error of the estimate is smallest in that case. When $n$ is also large,
the mean square error of the estimate approaches zero and hence
$n_1/2nM$ converges in probability to $L_1/(L_1 M + L_2)$. Also,
$n_1/n_2 M$ is a consistent estimate of $L_1/L_2$, and
$n_1/(n_1 + n_2 M)$ is a consistent estimate of $L_1/(L_1 + L_2)$.

Suppose samples of $2n$ items are drawn at random from the population
on the $x$th, $2x$th, $3x$th, $\ldots$, $kx$th inspections. Writing
$n_i(k)$ as the total number of items of type $i$ among the $2nk$ items
observed in the $k$ samples, the statistics $n_1(k)/2nkM$,
$n_1(k)/n_2(k)M$, and $n_1(k)/[n_1(k) + n_2(k)M]$ are consistent
estimates of $L_1/(L_1 M + L_2)$, $L_1/L_2$, and $L_1/(L_1 + L_2)$,
respectively, when $kn$ is large and the replacement policy has been in
use for a long period of time.

A reasonable symmetric procedure for choosing which type of equipment
has the greater longevity is the following: Choose type 1 if
$n_2 M/n_1 \le c$ and choose type 2 if $n_1/n_2 M \le c$, where $c < 1$
is a constant to be determined by the amount of indecision we are
willing to permit. The probability of making an incorrect decision,
say, of choosing type 2 when in fact type 1 has the greater longevity is

$$\begin{aligned}
P &= \mathrm{Pr}_x\left[n_1 \le cM(2n - n_1)\right] = \mathrm{Pr}_x\left[n_1 \le \frac{2ncM}{1 + cM}\right] \\
&= \sum_{n_1 = 0}^{\lfloor 2ncM/(1+cM) \rfloor} C^{2n}_{n_1}\left[\mathrm{Pr}_x(1 \mid p)\right]^{n_1}\left[\mathrm{Pr}_x(2 \mid p)\right]^{2n - n_1},
\end{aligned} \tag{18}$$

which may be computed directly or approximated by the normal
approximation to binomial variates.
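
Equation (18) can be evaluated directly by summing binomial terms. The
following sketch assumes the decision rule exactly as stated above
(type 2 is chosen when $n_1 \le cM(2n - n_1)$); the function name, its
arguments, and the sample values in the usage line are hypothetical.

```python
from math import comb


def prob_choose_type2(two_n, c, M, pr1):
    """Chance that n1 <= c*M*(2n - n1) when n1 is Binomial(2n, pr1).

    pr1 plays the role of Pr_x(1|p); 1 - pr1 plays the role of Pr_x(2|p).
    """
    total = 0.0
    for n1 in range(two_n + 1):
        if n1 <= c * M * (two_n - n1):
            total += comb(two_n, n1) * pr1 ** n1 * (1.0 - pr1) ** (two_n - n1)
    return total


# Hypothetical usage: 30 sampled items, c = 0.8, M = 2.
print(prob_choose_type2(30, 0.8, 2.0, 0.7))
```

Because the rule selects a lower range of $n_1$, the computed
probability falls as $\mathrm{Pr}_x(1 \mid p)$ rises, which is the
monotonicity used in Appendix A3.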
Sequential and nonsequential methods analogous to those presented in
Section 2.1.4 may be used to test hypotheses concerning relative
longevities.

2.2.3. A logistical consideration and estimates of absolute longevities.
The symmetric type of replacement policy has the property that about an
equal number of items of each type will be used as replacements (see
Section 2.1.7). Let us now consider the case where logistical
conditions are such that about $M$ times as many items of type 1 may be
used for replacements as items of type 2. This logistical restriction
will be satisfied if a replacement policy of the type defined in
Section 2.2.1 is adopted (that is, if the chance that a type 2 item
will be replaced by its opposite type is $M$ times the chance that a
type 1 item will be replaced by its opposite type). We shall see in
Section 4.2 that replacement policies of this type are the only
policies among a more general type of policy which satisfy this
logistic restriction. When the switch (M) policy is adopted fewer
replacements will be needed on the average.

The formulas presented in Appendix A5 for the symmetric type of
replacement policy ($M = 1$) may be generalized to the type of
replacement policy investigated in this section. We obtain the
following results: Considering the history of a single item in the
initial population, the probability that a replacement of type 1 was
used at the $x$th inspection is

$$\begin{aligned}
\mathrm{Pr}_x(r = 1 \mid p) &= \mathrm{Pr}_{x-1}(1 \mid p)\,f_1(1 - p) + \mathrm{Pr}_{x-1}(2 \mid p)\,f_2 M p \\
&= \frac{M}{L_1 M + L_2} + \left\{\frac{f_1(1 - p) + f_2 M p}{2} - \frac{M}{L_1 M + L_2}\right\}\left[1 - p(f_1 + f_2 M)\right]^{x-1}.
\end{aligned} \tag{19}$$

When the replacement policy is in use for a long period of time, we have

$$\mathrm{Pr}_\infty[r = 1 \mid p] = \mathrm{Pr}_\infty[r = 2 \mid p]\,M = M/[L_1 M + L_2]. \tag{20}$$

That is, about $M$ times as many type 1 items will be used for
replacements as type 2 items. The probability that a replacement will
be needed is

$$\begin{aligned}
\mathrm{Pr}_x(r \mid p) &= \mathrm{Pr}_x(r = 1 \mid p) + \mathrm{Pr}_x(r = 2 \mid p) \\
&= \frac{M + 1}{L_1 M + L_2} + \left[1 - p(f_1 + f_2 M)\right]^{x-1}\frac{(L_2 - L_1 M)(L_2 - L_1)}{2 L_1 L_2 (L_1 M + L_2)}.
\end{aligned} \tag{21}$$
If information concerning stock is available (that is, if the number
$U_x$ of replacements used since the first inspection is known), we can
estimate absolute longevities. By an approach similar to that of
Section 2.1.8 we find that $U_x/2N(M+1)x$ converges in probability to
$1/(L_1 M + L_2)$ and $2N(M+1)x/U_x$ converges in probability to
$(L_1 M + L_2)$ when $x$ becomes large. Hence, $N(M+1)x n_1/U_x M n$ can
be used to estimate the absolute longevity $L_1$. The bias of the
limiting distribution of this estimate is nil when the replacement rule
is used for a long period of time, and the estimate is consistent when
$n$ becomes large.


3. A NUMERICAL ILLUSTRATION 


We have attempted to reproduce a situation similar to the adoption
of replacement policies in a restaurant [1] where thirty tumblers are
utilized and the replacement policy is used for ten weeks ($x = 10$). A
small number ($2N = 30$) of tumblers were used in order to simplify
the simulation. The restaurant situation was simulated by use of a
table of random numbers. The chance $f_1$ of a failure by the $x$th week
for a type 1 tumbler which had not failed by week $x - 1$ was taken as
.50 and $f_2$ as .25; thus the relative longevity of type 1 tumblers was
$f_2/(f_1 + f_2) = 1/3$. The initial population consisted of 15 tumblers
of each type and the 50-50 policy was adopted. At the end of the tenth
week there were 14 type 1 tumblers in use. An estimate of the relative
longevity of type 1 tumblers based only on the composition of the
population at the end of the tenth week would be $14/30 = .467$ (see
Section 2.1.3). The expected value of the estimate is

$$\mathrm{Pr}_{10}(1 \mid \tfrac{1}{2}) = \tfrac{1}{3} + \left(\tfrac{1}{2}\right)\left(\tfrac{1}{3}\right)[1 - .375]^{10} = .335 \quad \text{[see equation (2)]}.$$


The square root of the mean square error is estimated by
$\sqrt{(.467)(.533)/30} = .091$ [see equation (6)]. From equation (5) we
find that the square root of the mean square error is in fact
$\sqrt{(.335)(.665)/30 + (.002)^2} = .086$. Suppose the type of tumbler
which appears most frequently in the sample is chosen as the type with
the greater longevity. We would have correctly chosen the type 2
tumbler on the basis of sample composition. The probability of making
the incorrect decision of choosing type 1 is computed from equation (7)
and found to be

$$P = \sum_{j=0}^{14} C^{30}_{j}\,(.665)^{j}\,(.335)^{30-j} = .026. \tag{22}$$
The simulation of the restaurant situation was repeated using the
switch policy for making replacements. All other conditions were the
same as in the preceding simulation (that is, $f_1 = .50$, $f_2 = .25$,
and the initial population consisted of 15 tumblers of each type). At
the end of the tenth week there were 11 type 1 tumblers in use. An
estimate of the relative longevity is $11/30 = .367$. The expected value
of this estimate is

$$\mathrm{Pr}_{10}(1 \mid 1) = \tfrac{1}{3} + \left(\tfrac{1}{2}\right)\left(\tfrac{1}{3}\right)[1 - .75]^{10} = .3333 \tag{23}$$

[see equation (2)]. The square root of the mean square error is
estimated by $\sqrt{(.367)(.633)/30} = .088$ [see equation (6)]. From
equation (5) we find that the square root of the mean square error is
in fact

$$\sqrt{(.333)(.667)/30} = .086. \tag{24}$$


Suppose the type of tumbler which appears most frequently in the 
sample is chosen as the type with the greater longevity. We would have 
correctly chosen the type 2 tumbler on the basis of sample composition. 
The probability of making the incorrect decision of choosing type 1 may 
be computed from equation (7) and found to be 


$$P = \sum_{j=0}^{14} C^{30}_{j}\,(.667)^{j}\,(.333)^{30-j} = .025. \tag{25}$$


We also found that 88 tumblers were needed as replacements during
the 10 weeks the switch policy was used. With this added information
estimates of the absolute longevities, which in fact were
$L_1 = 1/.50 = 2$ and $L_2 = 1/.25 = 4$, can be obtained. Using equation
(13) the estimates are $2(10)(11)/88 = (.367)600/88 = 2.5$ and
$(.633)600/88 = 4.3$, respectively. The square root of the mean square
error (the standard error) of these estimates is estimated as
approximately $(.088)600/88 = .6$ [see equation (14)]. In this numerical
illustration we have found that the estimates of the relative and
absolute longevities were all within one standard error of their true
values when the switch policy was used.
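
The arithmetic of this illustration can be retraced in a few lines. The
sketch below uses only constants quoted in the text ($f_1 = .50$,
$f_2 = .25$, $2N = 30$, $x = 10$, 11 type 1 tumblers, 88 replacements).
The helper names are ours, and the absolute-longevity formula is written
in the form in which the text applies equation (13), namely
$(n_1/2n) \cdot 2(2N)x/U_x$, which is an assumption about that equation
since it is not reproduced here.

```python
from math import sqrt

f1, f2 = 0.50, 0.25          # weekly failure chances of type 1 and type 2
L1, L2 = 1 / f1, 1 / f2      # true longevities: 2 and 4 weeks
two_N, x = 30, 10            # tumblers in use, weeks of observation


def expected_share(p):
    """Expected share of type 1 after x inspections (closed form, eq. 49)."""
    return L1 / (L1 + L2) + 0.5 * (L2 - L1) / (L1 + L2) * (1 - p * (f1 + f2)) ** x


def rmse(p):
    """Root mean square error: binomial variance plus squared bias."""
    share = expected_share(p)
    bias = share - L1 / (L1 + L2)
    return sqrt(share * (1 - share) / two_N + bias ** 2)


def absolute_longevities(n1, replacements):
    """Scale the observed shares by 2 * (2N) * x / U_x, as done in the text."""
    share = n1 / two_N
    scale = 2 * two_N * x / replacements
    return share * scale, (1 - share) * scale


print(round(expected_share(0.5), 3), round(expected_share(1.0), 4))
print(round(rmse(0.5), 3), round(rmse(1.0), 3))
print(absolute_longevities(11, 88))
```

Run as written, this should reproduce the expected values .335 and
.3333, the root mean square error .086 under both policies, and the
absolute longevity estimates of about 2.5 and 4.3.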


4. COMPARING k TYPES OF EQUIPMENT 


4.1. Definition of the Type of Replacement Policy and Estimates of
Relative Longevities


Consider the following type of replacement policy: When an item of
type $i$ fails, the probability is $p_{ij}$ that its replacement will be
of type $j$ ($i, j = 1, 2, \ldots, k$ and $\sum_{j=1}^{k} p_{ij} = 1$).
Then the probability that the descendant of an item in the initial
population will be of type $u$ at the $x$th inspection is

$$\mathrm{Pr}_x(u) = \mathrm{Pr}_{x-1}(u)\,s_u + \sum_{i=1}^{k} \mathrm{Pr}_{x-1}(i)\,f_i\,p_{iu} \tag{26}$$

or

$$\mathrm{Pr}_x(u) - \mathrm{Pr}_{x-1}(u) + \mathrm{Pr}_{x-1}(u)\,f_u = \sum_{i=1}^{k} \mathrm{Pr}_{x-1}(i)\,f_i\,p_{iu}, \tag{27}$$

where $f_i$ is the probability that an item of type $i$ which had not
failed by the $t$th inspection will fail by inspection $t + 1$, and
$s_i = 1 - f_i$.
This type of replacement policy clearly does not include all possible
replacement policies. A policy which is not in this class is: When an
item fails replace it by one of the type opposite that of the preceding
replacement. The first replacement to the population will be of type 1
with probability 1/2. This policy differs from those investigated
herein in that the type of replacement is determined by the type of the
preceding replacement rather than by the type of item which failed.
This policy is similar to the 50-50 policy since about 50% of the
replacements will be of the type different from the item which failed.
Also, when inspections are made at discrete inspection times, about
equal numbers of each type of equipment will be used as replacements if
either this policy or the 50-50 policy is adopted. For practical
purposes the analysis presented herein of the 50-50 policy may be used
as an approximation to the analysis of the other replacement policy.

We shall consider the case where the $p_{ij}$ and $f_i$ are such that
$\mathrm{Pr}_x(u)$ has a nonzero limit for $u = 1, 2, \ldots, k$, when
$x$ becomes large (see [3], p. 325). Hence,

$$\mathrm{Pr}_\infty(u)\,f_u = \mathrm{Pr}(u)\,f_u = \sum_{i=1}^{k} \mathrm{Pr}(i)\,f_i\,p_{iu} \tag{28}$$

for $u = 1, 2, \ldots, k$, and the distribution $\mathrm{Pr}(u)$ is
uniquely determined by this system of equations. Hence,
$\mathrm{Pr}(u) = L_u/\sum_{i=1}^{k} L_i$ if and only if
$\sum_{i=1}^{k} p_{iu} = 1$ for $u = 1, 2, \ldots, k$. For $k = 2$, this
implies that $p_{11} + p_{21} = p_{12} + p_{22}$, or
$p_{12} = p_{21} = p$. In other words, the symmetrical type of
replacement policy is the only type of policy among those investigated
herein which will in the long run make the average population
composition of the various types proportional to their longevities.

We also have that $\mathrm{Pr}(u) = M_u L_u/\sum_{i=1}^{k} M_i L_i$ if
and only if $\sum_{i=1}^{k} M_i p_{iu} = M_u$ for $u = 1, 2, \ldots, k$.
For $k = 2$, we see that the type of replacement policy studied in
Section 2.2 is the only type among those described herein which will in
the long run make the average number of type 1 items in the population
$M L_1/L_2$ times as great as the number of type 2 items.

For any given replacement policy which has been in use for a long
period of time, estimates of the relative longevity
$L_u/\sum_{i=1}^{k} L_i$ may be obtained which are based only on the
composition of an observed sample. That is, letting $p_{ij}$ describe
the given replacement policy and $n_u$ be the number of items of type
$u$ observed in a sample of $n$ items, we first solve the following
system of $k$ linear equations for $M_u$,

$$M_u = \sum_{i=1}^{k} M_i\,p_{iu}. \tag{29}$$

Then $n_u/nM_u$ and $n_u M_u^{-1}/\sum_{i=1}^{k} n_i M_i^{-1}$ are
consistent estimates of $L_u/\sum_{i=1}^{k} M_i L_i$ and
$L_u/\sum_{i=1}^{k} L_i$, respectively, when $n$ is large.
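
The long-run statements of this section can be checked numerically by
iterating equation (26) and comparing the limit with
$M_u L_u/\sum_i M_i L_i$, where the $M_u$ solve equation (29). The
sketch below is a minimal illustration: the function names are ours,
equation (29) is solved by repeated substitution (which assumes a
replacement matrix with all entries positive), and any matrix or
failure chances passed in are the user's own illustrative values.

```python
def replacement_weights(P, sweeps=2000):
    """Solve M_u = sum_i M_i * p_iu (equation 29), scaled so sum(M) = 1."""
    k = len(P)
    M = [1.0 / k] * k
    for _ in range(sweeps):
        M = [sum(M[i] * P[i][u] for i in range(k)) for u in range(k)]
        total = sum(M)
        M = [m / total for m in M]
    return M


def long_run_composition(P, f, inspections=2000):
    """Iterate equation (26), starting from equal shares of each type."""
    k = len(P)
    pr = [1.0 / k] * k
    for _ in range(inspections):
        pr = [pr[u] * (1.0 - f[u])
              + sum(pr[i] * f[i] * P[i][u] for i in range(k))
              for u in range(k)]
    return pr


def predicted_composition(P, f):
    """M_u * L_u / sum_i M_i * L_i, with longevities L_u = 1 / f_u."""
    M = replacement_weights(P)
    weighted = [m / f_u for m, f_u in zip(M, f)]
    total = sum(weighted)
    return [w / total for w in weighted]
```

For a positive replacement matrix the iterated composition and the
predicted composition should agree to many decimal places, so that
dividing observed shares by $M_u$ recovers quantities proportional to
the longevities.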


4.2. A Logistical Consideration and Estimates of Absolute Longevities 


The chance that a replacement of type $u$ will be needed at the $x$th
inspection as a descendant for an item in the initial population is

$$\mathrm{Pr}_x(r = u) = \sum_{i=1}^{k} \mathrm{Pr}_{x-1}(i)\,f_i\,p_{iu}. \tag{30}$$

Hence, when the replacement rule has been in use for a long period of
time,

$$\mathrm{Pr}_\infty(r = u) = \mathrm{Pr}(r = u) = \sum_{i=1}^{k} \mathrm{Pr}(i)\,f_i\,p_{iu} = \mathrm{Pr}(u)\,f_u. \tag{31}$$

Therefore

$$\mathrm{Pr}(r = u) = M_u \Big/ \sum_{i=1}^{k} M_i L_i, \tag{32}$$

where the values of $M_u$ are determined by the system of equations,

$$M_u = \sum_{i=1}^{k} M_i\,p_{iu} \quad \text{for } u = 1, 2, \ldots, k. \tag{33}$$

For $k = 2$, we see that the type of replacement policy investigated in
Section 2.2 is the only policy among those described herein which will
in the long run use for replacements $M$ times as many type 1 items as
items of type 2.

If the proportions $M_u/\sum_{i=1}^{k} M_i$ of the total replacements
which are of type $u$ are given by logistical considerations, then the
type of replacement policy which will satisfy this supply restriction
may be determined by the $\{p_{ij}\}$ satisfying the system of equations

$$M_u = \sum_{i=1}^{k} M_i\,p_{iu}, \qquad 1 = \sum_{i=1}^{k} p_{ui}, \tag{34}$$

for $u = 1, 2, \ldots, k$. For example, if it is desired that the
$p_{iu}$ be independent of $i$ (that is, $p_{iu} = p_u$, where the type
of failing item need not be observed), $p_u$ must be equal to

$$M_u \Big/ \sum_{i=1}^{k} M_i. \tag{35}$$

Since $\sum_{u=1}^{k} \mathrm{Pr}(r = u) = \sum_{u=1}^{k} M_u/\sum_{i=1}^{k} M_i L_i$,
the average of the number $U_x$ of replacements used since the first
inspection will be approximately

$$\left(Nx \sum_{u=1}^{k} M_u\right) \Big/ \sum_{i=1}^{k} M_i L_i, \tag{36}$$

where the population consists of $N$ items. We find that
$U_x/(Nx \sum_{u} M_u)$ converges in probability to
$1/\sum_{i} M_i L_i$ and $(Nx \sum_{u} M_u)/U_x$ converges in
probability to $\sum_{i} M_i L_i$, when $x$ becomes large. Hence, if
information concerning stock is available, then
$n_u N x (\sum_{i=1}^{k} M_i)/U_x n M_u$ is a consistent estimate of the
longevity $L_u$ of items of type $u$ when the replacement rule has been
in use for a long period of time and where $n$ is large.
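
The estimate $n_u N x (\sum_i M_i)/U_x n M_u$ is simple to compute once
the sample counts and the stock record are in hand. The sketch below is
an illustration with a hypothetical function name and argument names.
Note that this section writes the population size as $N$, so the
thirty tumblers of Section 3 enter here as a population of 30.

```python
def absolute_longevity_estimates(counts, weights, population, inspections,
                                 replacements_used):
    """Estimate each L_u by n_u * N * x * sum(M) / (U_x * n * M_u).

    counts            -- n_u, items of each type observed in the sample
    weights           -- M_u, the solution of M_u = sum_i M_i * p_iu
    population        -- N, total number of items in operation
    inspections       -- x, inspections since the policy was adopted
    replacements_used -- U_x, replacements drawn from stock so far
    """
    n = sum(counts)
    scale = population * inspections * sum(weights) / (replacements_used * n)
    return [n_u * scale / m_u for n_u, m_u in zip(counts, weights)]


# Section 3 switch-policy figures: 11 and 19 tumblers, 88 replacements.
print(absolute_longevity_estimates([11, 19], [1, 1], 30, 10, 88))
```

With the switch-policy figures of Section 3 and equal weights this
should return about 2.5 and 4.3, the values obtained there from
equation (13).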


5. THE NONPARAMETRIC NATURE OF THE ASYMPTOTIC RESULTS 


In the preceding sections it was assumed that the probability that
an item of type $i$, which had not failed by the $t$th inspection, will
fail by inspection $t + 1$, was a constant $1/L_i$. In this section we
shall show that even if this assumption is not true the asymptotic
results which we have obtained in the preceding sections are still
valid; that is, the asymptotic results do not depend on the assumption
of a constant risk.

Let us assume only that the equipment has a finite life span, and let
$a_{xi}$ be the probability that an item of type $i$ will serve for
exactly $x$ inspections, $\sum_{x=1}^{\infty} a_{xi} = 1$. The longevity
of type $i$ items is

$$\sum_{x=1}^{\infty} x\,a_{xi} = \sum_{x=1}^{\infty} b_{xi} = L_i, \tag{37}$$

where $b_{xi} = \sum_{j=x}^{\infty} a_{ji}$ is the probability that an
item of type $i$ will serve for at least $x$ inspections. The quantity
$a_{xi}/b_{xi} = c_{xi}$ is the conditional probability that an item of
type $i$ which has not failed on the first $x - 1$ inspections will fail
on the $x$th inspection. We wish to show that the asymptotic results do
not depend directly on the values of the $a_{xi}$ (or $b_{xi}$, or
$c_{xi}$) but only on the values of the $L_i$. For $x$ sufficiently
large, the probability that the descendant of an item in the initial
population will be of type $u$ and will have served for exactly $t$
inspections is

$$\mathrm{Pr}_x(u, t) = \mathrm{Pr}_{x-1}(u, t - 1)\,(1 - c_{tu}) \tag{38}$$

for $t \ge 1$, and

$$\mathrm{Pr}_x(u, 0) = \sum_{i=1}^{k} \sum_{j=1}^{\infty} \mathrm{Pr}_{x-1}(i, j - 1)\,c_{ji}\,p_{iu}, \tag{39}$$

where $p_{iu}$ is the probability that a replacement for an item of type
$i$ will be of type $u$.

Again we shall deal with the case where the $p_{iu}$ are such that
$\mathrm{Pr}_x(u, 0)$ has a nonzero limit for $u = 1, 2, \ldots, k$,
when $x$ becomes large (see [3], pp. 325 and 275). Then


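$$\begin{aligned}
\mathrm{Pr}_\infty(u, t) = \mathrm{Pr}(u, t) &= \mathrm{Pr}(u, 0) \prod_{j=1}^{t} (1 - c_{ju}) \\
&= \mathrm{Pr}(u, 0)\,b_{t+1,u}
\end{aligned} \tag{40}$$

and

$$\begin{aligned}
\mathrm{Pr}(u, 0) &= \sum_{i=1}^{k} \sum_{j=1}^{\infty} \mathrm{Pr}(i, 0)\,b_{ji}\,c_{ji}\,p_{iu} \\
&= \sum_{i=1}^{k} \mathrm{Pr}(i, 0)\,p_{iu} \sum_{j=1}^{\infty} a_{ji} \\
&= \sum_{i=1}^{k} \mathrm{Pr}(i, 0)\,p_{iu}.
\end{aligned} \tag{41}$$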

The probability that a descendant of an item in the initial population
will be of type $u$ approaches

$$\mathrm{Pr}(u) = \sum_{t=0}^{\infty} \mathrm{Pr}(u, t) = \mathrm{Pr}(u, 0) \sum_{t=0}^{\infty} b_{t+1,u} = \mathrm{Pr}(u, 0)\,L_u. \tag{42}$$

Hence, the proportion $M_u = \mathrm{Pr}(u, 0)/\sum_{i=1}^{k} \mathrm{Pr}(i, 0)$
of the total replacements which are of type $u$ is determined by the
system of equations

$$M_u = \sum_{i=1}^{k} M_i\,p_{iu}, \qquad \sum_{i=1}^{k} M_i = 1, \tag{43}$$

and the probability that an item drawn at random from the population
will be of type $u$ approaches $M_u L_u/\sum_{i=1}^{k} M_i L_i$. Also,
$\sum_{u=1}^{k} \sum_{t=0}^{\infty} \mathrm{Pr}(u, t) = \sum_{u=1}^{k} \mathrm{Pr}(u, 0)\,L_u = 1$.
Therefore, the probability that an item of type $u$ will be needed as a
replacement is

$$\mathrm{Pr}(u, 0) = M_u \Big/ \sum_{i=1}^{k} M_i L_i. \tag{44}$$

` 


LIFE OF EQUIPMENT UNDER OPERATIONAL CONDITIONS 523 


From these facts we see that when a given replacement rule is used for
a long period of time, the results which we had obtained in the
preceding sections remain valid (for equipment with a finite life span)
even when the risk is not constant.
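
The claim that only the mean lives $L_i$ matter in the long run can be
illustrated by iterating equations (38) and (39) for lifetime
distributions whose risk is far from constant. The sketch below is an
illustration under stated assumptions: the function names are ours, the
lifetimes are finite lists of probabilities $a_{1i}, a_{2i}, \ldots$
supplied by the user, and convergence assumes that every type can fail
at its first inspection so that the process is not periodic.

```python
def hazards(a):
    """c_x = a_x / b_x with b_x = a_x + a_{x+1} + ...; the last is forced to 1."""
    c, tail = [], 1.0
    for a_x in a:
        c.append(min(1.0, a_x / tail))
        tail -= a_x
    c[-1] = 1.0  # finite life span: an item at its last possible age must fail
    return c


def long_run_shares(lifetimes, P, inspections=500):
    """Iterate equations (38)-(39); return the long-run share of each type."""
    k = len(P)
    c = [hazards(a) for a in lifetimes]
    # state[u][t]: chance the descendant is of type u and has served t inspections
    state = [[1.0 / k] + [0.0] * (len(a) - 1) for a in lifetimes]
    for _ in range(inspections):
        new = [[0.0] * len(a) for a in lifetimes]
        for i in range(k):
            for t, mass in enumerate(state[i]):
                fail = mass * c[i][t]              # fails at its (t+1)th inspection
                if t + 1 < len(state[i]):
                    new[i][t + 1] += mass - fail   # survivor ages, equation (38)
                for u in range(k):
                    new[u][0] += fail * P[i][u]    # fresh replacement, equation (39)
        state = new
    return [sum(row) for row in state]


def mean_life(a):
    """L = sum over x of x * a_x, as in equation (37)."""
    return sum((x + 1) * a_x for x, a_x in enumerate(a))
```

The shares returned should match $M_u L_u/\sum_i M_i L_i$ computed from
the mean lives alone, whatever the shape of the lifetime distributions.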


APPENDIX

A1. The Difference Equation

We shall prove that the quantity $\mathrm{Pr}_x(1 \mid p)$ given by
equation (2) represents the probability that an item will be of type 1
if it is drawn at random from the $2N$ items of which the population is
composed at the $x$th inspection. Consider the history of one of the
items of type 1 in the initial population. It may fail on, say, the 6th
inspection and then might be replaced by an item of the opposite type,
which will happen with probability $p$. The replacement might then fail
on the 3rd inspection and then its replacement might be of the opposite
type (with probability $p$). This history might be described by the
sequence of numbers

1, 1, 1, 1, 1, 1, 2, 2, 2, 1, ...

or, more generally,

$u_0, u_1, u_2, u_3, u_4, u_5, u_6, u_7, u_8, u_9, \ldots$

where $u_x = i$ when the “descendant” of an item in the initial
population is of type $i$ at the $x$th inspection. If a member of the
initial population is drawn at random, the probability is 1/2 that it
will be of type 1 since the case where the population was initially
composed of $N$ items of each type of equipment is being considered;
that is, $\mathrm{Pr}(u_0 = 1 \mid p) = \mathrm{Pr}_0(1 \mid p) = 1/2$.
We see that $u_1 = 1$ if either

(a) $u_0 = 1$ and the item did not fail on the first inspection,

(b) $u_0 = 1$ and the item failed on the first inspection but its
replacement was of the same type, or

(c) $u_0 = 2$ and the item failed on the first inspection and its
replacement was of the opposite type.

The probabilities of (a), (b), and (c) are

(a) $\mathrm{Pr}_0(1 \mid p)\,s_1$,
(b) $\mathrm{Pr}_0(1 \mid p)\,f_1(1 - p)$,
(c) $\mathrm{Pr}_0(2 \mid p)\,f_2\,p$,
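respectively.

Hence the probability that $u_1 = 1$ may be computed by summing these
three probabilities. That is

$$\begin{aligned}
\mathrm{Pr}(u_1 = 1 \mid p) &= \mathrm{Pr}_1(1 \mid p) \\
&= \mathrm{Pr}_0(1 \mid p)\,[s_1 + f_1(1 - p)] + \mathrm{Pr}_0(2 \mid p)\,f_2\,p \\
&= \mathrm{Pr}_0(1 \mid p)\,(1 - f_1 p - f_2 p) + f_2 p,
\end{aligned} \tag{45}$$

since $\mathrm{Pr}_0(2 \mid p) = 1 - \mathrm{Pr}_0(1 \mid p)$. Similarly

$$\begin{aligned}
\mathrm{Pr}_2(1 \mid p) &= \mathrm{Pr}_1(1 \mid p)\,(1 - f_1 p - f_2 p) + f_2 p, \\
\mathrm{Pr}_3(1 \mid p) &= \mathrm{Pr}_2(1 \mid p)\,(1 - f_1 p - f_2 p) + f_2 p,
\end{aligned} \tag{46}$$

and

$$\mathrm{Pr}_x(1 \mid p) = \mathrm{Pr}_{x-1}(1 \mid p)\,(1 - f_1 p - f_2 p) + f_2 p. \tag{47}$$

Since it is here assumed that $\mathrm{Pr}_0(1 \mid p) = 1/2$,

$$\begin{aligned}
\mathrm{Pr}_1(1 \mid p) &= [1 - p(f_1 + f_2)]/2 + f_2 p, \\
\mathrm{Pr}_2(1 \mid p) &= [1 - p(f_1 + f_2)]^2/2 + f_2 p\,[1 - p(f_1 + f_2)] + f_2 p, \\
\mathrm{Pr}_3(1 \mid p) &= [1 - p(f_1 + f_2)]^3/2 + f_2 p\,[1 - p(f_1 + f_2)]^2 \\
&\quad + f_2 p\,[1 - p(f_1 + f_2)] + f_2 p,
\end{aligned} \tag{48}$$

and

$$\begin{aligned}
\mathrm{Pr}_x(1 \mid p) &= [1 - p(f_1 + f_2)]^x/2 + f_2 p \sum_{i=0}^{x-1} [1 - p(f_1 + f_2)]^i \\
&= [1 - p(f_1 + f_2)]^x/2 + \frac{f_2\left\{1 - [1 - p(f_1 + f_2)]^x\right\}}{f_1 + f_2} \\
&= \frac{f_2}{f_1 + f_2} + \left[\frac{1}{2} - \frac{f_2}{f_1 + f_2}\right][1 - p(f_1 + f_2)]^x \\
&= \frac{L_1}{L_1 + L_2} + \frac{1}{2}\left[\frac{L_2 - L_1}{L_1 + L_2}\right][1 - p(f_1 + f_2)]^x.
\end{aligned} \tag{49}$$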
The quantity $\mathrm{Pr}_x(1 \mid p)$ given by equation (49) represents
the probability that, if an item is drawn at random from the $2N$ items
of which the population is initially composed, its descendant will be
of type 1 at the $x$th inspection. Furthermore, if an item is drawn at
random from the $2N$ items of which the population is composed at the
$x$th inspection, the probability that it is of type 1 is also
$\mathrm{Pr}_x(1 \mid p)$.

A2. The Mean Square Error


We shall prove that among all replacement policies of the type
investigated in Section 2.1 the switch policy leads to estimates which
have the smallest mean square error. The mean square error is (see
Section 2.1.3)

$$\mathrm{Pr}_x(1 \mid p)\,\mathrm{Pr}_x(2 \mid p)/2n + B^2(p). \tag{50}$$

In Section 2.1.3 it is seen that the absolute bias $|B(p)|$ is a
monotonically decreasing function of $p$. Hence, $B^2(p)$ is minimized
when the switch policy is adopted (i.e., $p = 1$). It is therefore
sufficient to prove that
$\mathrm{Pr}_x(1 \mid p)\,\mathrm{Pr}_x(2 \mid p)$ is a monotonically
decreasing function of $p$. Since

$$\left|\tfrac{1}{2} - \mathrm{Pr}_x(1 \mid p)\right| = \left|\tfrac{1}{2} - \mathrm{Pr}_x(2 \mid p)\right| = \frac{1}{2}\left|\frac{L_2 - L_1}{L_1 + L_2}\right|\left\{1 - [1 - p(f_1 + f_2)]^x\right\} \tag{51}$$

is a monotonically increasing function of $p$, the distance between
$\mathrm{Pr}_x(1 \mid p)$ (or $\mathrm{Pr}_x(2 \mid p)$) and 1/2 is
maximized when $p = 1$. Hence the function
$\mathrm{Pr}_x(1 \mid p)\,[1 - \mathrm{Pr}_x(1 \mid p)] = \mathrm{Pr}_x(1 \mid p)\,\mathrm{Pr}_x(2 \mid p)$
is minimized when $p = 1$. Therefore, the mean square error is also
minimized when $p = 1$.


A3. A Theorem on Binomial Variates

We shall prove that the quantity $P$ given by equation (7) is a
decreasing function of $\mathrm{Pr}_x(1 \mid p)$; that is,
$P(g) = \sum_{i=0}^{d} C^{n}_{i}\,(1 - g)^{i}\,g^{n-i}$ is an increasing
function of $g$ when $0 < g < 1$ and $d < n$. This follows from the fact
that the derivative of $P(g)$ with respect to $g$,

$$\frac{\partial P(g)}{\partial g} = \frac{\partial}{\partial g} \sum_{i=0}^{d} C^{n}_{i}\,(1 - g)^{i}\,g^{n-i}, \tag{52}$$

is equal to

$$n\,C^{n-1}_{d}\,(1 - g)^{d}\,g^{n-d-1}, \tag{53}$$

which may be proved by mathematical induction on $d$. Since the
derivative is positive, $P(g)$ is an increasing function of $g$.

This result, which is of interest in itself, may be worded as follows:
Let $X$ be the number of successes in $n$ Bernoulli trials where the
probability of success for a trial is $p_1$, and let $Y$ be the number
of successes when the probability of success for a trial is $p_2$. Then
$X$ is stochastically larger than $Y$ if and only if $p_1$ is larger
than $p_2$. This fact follows from the preceding result since we have
seen that the probability that $X$ will be less than or equal to $d$ is
an increasing function of $1 - p_1$ and, hence, a decreasing function
of $p_1$.
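
The derivative identity of this appendix can be spot-checked
numerically by comparing a central difference of $P(g)$ with the closed
form $n\,C^{n-1}_{d}(1 - g)^{d} g^{n-d-1}$. The sketch below is only
such a check, not a proof; the function names and the step size are
our own choices.

```python
from math import comb


def P(g, n, d):
    """P(g) = sum_{i=0}^{d} C(n, i) * (1 - g)**i * g**(n - i)."""
    return sum(comb(n, i) * (1 - g) ** i * g ** (n - i) for i in range(d + 1))


def dP_closed_form(g, n, d):
    """Closed-form derivative: n * C(n - 1, d) * (1 - g)**d * g**(n - d - 1)."""
    return n * comb(n - 1, d) * (1 - g) ** d * g ** (n - d - 1)


def dP_numeric(g, n, d, h=1e-6):
    """Central-difference approximation to the derivative of P at g."""
    return (P(g + h, n, d) - P(g - h, n, d)) / (2 * h)


print(dP_closed_form(0.335, 30, 14), dP_numeric(0.335, 30, 14))
```

The two printed values should agree closely, and both are positive, in
line with the statement that $P(g)$ increases with $g$.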


A4. An Impossibility Theorem


We shall prove that it is impossible to devise a “simple” replacement
policy which would lead asymptotically to exclusive usage of the type
with the greater longevity. A replacement policy is defined as “simple”
if the policy determines the number of replacements of each type using
at most the following two facts: (1) the population composition after
the last inspection, and (2) the number of failures of each type since
the last inspection. In other words, a “simple” replacement policy is
an integer valued random variable $R(m_1, m_2, c)$ whose distribution
depends on the number $m_1$ of failures of type 1 equipment since the
last inspection, the number $m_2$ of failures of type 2 equipment since
the last inspection, and the number $c$ of type 1 pieces of equipment
in the population after the last inspection. The random variable
$R(m_1, m_2, c)$ denotes the number of type 1 pieces used as
replacements. For example, for the 50-50 policy $R(m_1, m_2, c)$ is a
binomial variate with parameters $m_1 + m_2$ and 1/2. For the switch
policy $R(m_1, m_2, c) = m_2$ (that is, it is a binomial variate with
parameters $m_2$ and 1). All replacement policies of the symmetric type
investigated in Section 2.1 are “simple” policies. If the chance is $p$
that the replacement for an item which failed will be of the opposite
type, $R(m_1, m_2, c)$ is the sum of two binomial variates, the first
variate having the parameters $m_2$ and $p$, the second variate having
the parameters $m_1$ and $(1 - p)$. Since the distribution of
$R(m_1, m_2, c)$ for the symmetric type of replacement policies depends
only on $m_1$ and $m_2$, we could devise replacement policies which
were “simple” but were not of the symmetric type. Of course,
replacement policies which depend on $c$ would not be simpler than
rules which do not depend on knowing the total population composition
after the last inspection.

Since the distribution of $m_1$ depends only on $c$ and $L_1$ and the
distribution of $m_2$ depends only on $2N - c$ and $L_2$, the expected
number of type 1 items in the population after replacement is a
function $G(c, L_1, L_2)$ of the composition $c$ after the preceding
inspection, and the longevities $L_1$ and $L_2$. Writing
$\mathrm{Pr}_x(c)$ to represent the probability that the population
will contain $c$ items of type 1 at the $x$th inspection, we have that
the expected number of type 1 items in the population at the following
inspection will be
$E_{x+1}(c^*) = \sum_{c=0}^{2N} G(c, L_1, L_2)\,\mathrm{Pr}_x(c)$. If the
replacement policy leads asymptotically to exclusive usage of the type
with the greater longevity,

$$\lim_{x \to \infty} \mathrm{Pr}_x(2N) = 1 \quad \text{and} \quad \lim_{x \to \infty} E_x(c^*) = 2N \tag{54}$$

when $L_1 > L_2$. Hence it is necessary that $G(2N, L_1, L_2) = 2N$ and
$G(0, L_1, L_2) = 0$, which means that once the population consists
exclusively of one type of equipment then only that type will be used
as replacements at future inspections. Therefore the probability is at
least $\mathrm{Pr}_x(2N)$ that type 2 equipment will no longer be used
as replacements in any inspections following the $x$th inspection.
Since the probability is at least
$\mathrm{Pr}_x(2N) + \mathrm{Pr}_x(0)$ that the decision to exclusively
use one type of equipment will be made at the $x$th inspection or
before, there is a finite probability that this decision will be
incorrect. Hence there is no certainty the replacement policy will lead
asymptotically to exclusive usage of the type with the greater
longevity.


A5. The Necessary Supply of Replacements 


Consider the history of a single item in the initial population. A
replacement of type 1 was used at the $x$th inspection if either (a)
the descendant at inspection $x - 1$ was of type 1 and failed by the
$x$th inspection and was replaced by the same type (chance is $1 - p$),
or (b) the descendant at inspection $x - 1$ was of type 2 and failed by
the $x$th inspection and was replaced by the opposite type (chance is
$p$). Hence the probability that a replacement of type 1 was used at
the $x$th inspection is

$$\begin{aligned}
\mathrm{Pr}_x(r = 1 \mid p) &= \mathrm{Pr}_{x-1}(1 \mid p)\,f_1(1 - p) + \mathrm{Pr}_{x-1}(2 \mid p)\,f_2\,p \\
&= \frac{1}{L_1 + L_2} + \left\{\frac{f_1(1 - p) + f_2 p}{2} - \frac{1}{L_1 + L_2}\right\}[1 - p(f_1 + f_2)]^{x-1}.
\end{aligned} \tag{55}$$

When the replacement policy is in use for a long period of time, about
the same number of each type of equipment will be used as replacements;
that is,

$$\mathrm{Pr}_\infty(r = 1 \mid p) = \mathrm{Pr}_\infty(r = 2 \mid p) = 1/(L_1 + L_2). \tag{56}$$

The probability that a replacement will be made at the $x$th inspection
is

$$\begin{aligned}
\mathrm{Pr}_x(r \mid p) &= \mathrm{Pr}_x(r = 1 \mid p) + \mathrm{Pr}_x(r = 2 \mid p) \\
&= \frac{2}{L_1 + L_2} + \left[\frac{f_1 + f_2}{2} - \frac{2}{L_1 + L_2}\right][1 - p(f_1 + f_2)]^{x-1} \\
&= \frac{2}{L_1 + L_2} + [1 - p(f_1 + f_2)]^{x-1}\,\frac{(L_1 - L_2)^2}{2 L_1 L_2 (L_1 + L_2)}.
\end{aligned} \tag{57}$$

Since the second term on the right side of the equality is positive,
the probability that a replacement will be made is a decreasing
function of $p$ and assumes its minimum when $p = 1$. That is, when the
switch policy is adopted fewer replacements will be needed on the
average.


A6. Comparison of the Switch and 50-50 Policies 


We shall prove that to reduce the bias of the estimates of longevities
by any specified amount takes at least twice as many inspections under
the 50-50 policy as under the switch policy. From equation (4), we see
that the bias of the estimate $n_1/2n$ when the 50-50 policy is adopted
for $y$ inspections will be greater than the bias when the switch
policy is used for $x$ inspections, as long as

$$[1 - (f_1 + f_2)/2]^{y} > [1 - (f_1 + f_2)]^{x}. \tag{58}$$

When $y \le 2x$, assuming $f_1 + f_2 > 0$, we have that

$$[1 - (f_1 + f_2)/2]^{y} \ge [1 - (f_1 + f_2)/2]^{2x} = [1 - (f_1 + f_2) + (f_1 + f_2)^2/4]^{x} > [1 - (f_1 + f_2)]^{x}.$$

Hence, when $y \le 2x$, the bias using the 50-50 policy is greater than
when the switch policy is used.
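
A small computation makes the comparison concrete. The sketch below
counts how many inspections each policy needs before the factor
$[1 - p(f_1 + f_2)]^x$ that multiplies the bias falls to a chosen
target. The function names and the target value are hypothetical; the
failure chances are those of the numerical illustration in Section 3.

```python
def bias_factor(p, f1, f2, inspections):
    """The factor [1 - p*(f1 + f2)]**x that multiplies the bias."""
    return (1.0 - p * (f1 + f2)) ** inspections


def inspections_needed(p, f1, f2, target):
    """Smallest number of inspections bringing the bias factor to the target."""
    x = 0
    while bias_factor(p, f1, f2, x) > target:
        x += 1
    return x


f1, f2 = 0.50, 0.25
print(inspections_needed(1.0, f1, f2, 1e-3))   # switch policy
print(inspections_needed(0.5, f1, f2, 1e-3))   # 50-50 policy
```

With these values the switch policy needs 5 inspections and the 50-50
policy needs 15, which is more than twice as many.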


A7. An Advantage to Replacement at Each Inspection


Consider a modified form of the symmetric type of replacement policies
defined in 2.1.1 where replacements are made if necessary only on
inspections $z, 2z, 3z, \ldots$. We shall prove the statement made in
Section 2.1.11 that, if the replacement policies are used for a long
period of time, the bias and mean square error of the estimates of
relative longevities will be minimized and the power of the test
concerning relative longevities will be maximized when items which have
failed are replaced at the next inspection; that is, when $z = 1$.

Let $f_i$ be the probability that an item of type $i$ will fail in the
time interval between successive inspections if it did not fail at the
earlier inspection. Then the chance that the item will fail in the time
interval between inspections $jz$ and $(j+1)z$ is

$$f_i + s_i f_i + s_i^2 f_i + \cdots + s_i^{z-1} f_i = 1 - s_i^{z} = f_i(z). \tag{59}$$

If the replacement policy is used for a long period of time, the number
of items of type 1 in a random sample of $2n$ items from the population
composition at the $x$th inspection ($x$ is large) will have a binomial
distribution with parameters $2n$ and
$g_1(z) = f_2(z)/[f_1(z) + f_2(z)]$ approximately. When $z = 1$,
$g_1(1) = f_2(1)/[f_1(1) + f_2(1)] = L_1/(L_1 + L_2)$. In order to prove
that the bias and mean square error of the estimate $n_1/2n$ is
minimized when $z = 1$, it is sufficient to prove that $g_1(z)$ is a
monotonically decreasing function of $z$ when $L_1 > L_2$. The function
$g_1(z)$ is monotonically decreasing if $g_1(z)/[1 - g_1(z)]$ is
monotonically decreasing. Hence, it is sufficient to show that
$g_1(z)/[1 - g_1(z)] > g_1(z+1)/[1 - g_1(z+1)]$ when $L_1 > L_2$; that
is,

$$[1 - s_2^{z}]/[1 - s_1^{z}] > [1 - s_2^{z+1}]/[1 - s_1^{z+1}] \tag{60}$$

or

$$[1 - s_1^{z}]/[1 - s_1^{z+1}] < [1 - s_2^{z}]/[1 - s_2^{z+1}]. \tag{61}$$

This last inequality holds if $[1 - s^{z}]/[1 - s^{z+1}] = h(s)$ is a
decreasing function of $s$ for $0 < s < 1$. The derivative of $h(s)$ is

$$\frac{s^{z-1}\left[-z + (z+1)s - s^{z+1}\right]}{\left[1 - s^{z+1}\right]^{2}}, \tag{62}$$

which is negative for $0 < s < 1$.
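
The monotonicity of $g_1(z)$ can also be seen by direct evaluation. The
sketch below is an illustration with a hypothetical function name; the
failure chances $f_1 = .25$ and $f_2 = .50$ are assumed values chosen so
that $L_1 = 4 > L_2 = 2$.

```python
def g1(z, f1, f2):
    """g_1(z) = f_2(z) / [f_1(z) + f_2(z)], where f_i(z) = 1 - (1 - f_i)**z."""
    fail1 = 1.0 - (1.0 - f1) ** z
    fail2 = 1.0 - (1.0 - f2) ** z
    return fail2 / (fail1 + fail2)


# Assumed values with L1 = 4 > L2 = 2; g1 starts at L1/(L1 + L2) = 2/3.
values = [g1(z, 0.25, 0.50) for z in range(1, 11)]
print(values)
```

The printed values start at $L_1/(L_1 + L_2) = 2/3$ for $z = 1$ and
fall steadily toward 1/2, so the composition carries less information
about relative longevity as the replacement interval grows.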


A8. The Differential Equation

Let us assume that the chance is $f_i\,dt$ that an item of type $i$
which had not failed by time $t$ will fail before time $t + dt$.
Consider the type of replacement policy where the chance is $p$ that a
replacement will be of the opposite type from the item which fails and
where there is instantaneous replacement upon failure. Consider the
history of a single item in the initial population. Then the chance
that its descendant will be of type 1 at time $t + dt$ is

$$\begin{aligned}
\mathrm{Pr}_{t+dt}(1 \mid p) &= \mathrm{Pr}_t(1 \mid p)\,[f_1\,dt\,(1 - p) + 1 - f_1\,dt] + \mathrm{Pr}_t(2 \mid p)\,(f_2\,p\,dt) \\
&= \mathrm{Pr}_t(1 \mid p)\,[1 - (f_1 p + f_2 p)\,dt] + f_2\,p\,dt.
\end{aligned} \tag{63}$$

This equation is analogous to the difference equation (47). Writing
$P(t) = \mathrm{Pr}_t(1 \mid p)$, then

$$P(t + dt) - P(t) = [f_2 - P(t)(f_1 + f_2)]\,p\,dt \tag{64}$$

and

$$\frac{dP(t)}{dt} = [f_2 - P(t)(f_1 + f_2)]\,p. \tag{65}$$

Since it is here assumed that $P(0) = 1/2$, the solution of this
differential equation is

$$P(t) = \frac{L_1}{L_1 + L_2} + \frac{1}{2}\left[\frac{L_2 - L_1}{L_1 + L_2}\right]e^{-p(f_1 + f_2)t}. \tag{66}$$

This solution is analogous to the solution (equation 49) of the
difference equation (47) upon which most of the results in the
preceding sections were based.
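
As a check on equation (66), the differential equation (65) can be
integrated numerically from $P(0) = 1/2$ and compared with the closed
form. The sketch below uses a plain Euler scheme, which is our own
choice of method; the function names and the number of steps are
likewise assumptions made for the illustration.

```python
from math import exp


def p_closed_form(t, p, f1, f2):
    """Equation (66), with longevities L_i = 1 / f_i."""
    L1, L2 = 1.0 / f1, 1.0 / f2
    return L1 / (L1 + L2) + 0.5 * (L2 - L1) / (L1 + L2) * exp(-p * (f1 + f2) * t)


def p_euler(t, p, f1, f2, steps=20000):
    """Euler integration of dP/dt = [f2 - P*(f1 + f2)] * p from P(0) = 1/2."""
    dt = t / steps
    P = 0.5
    for _ in range(steps):
        P += (f2 - P * (f1 + f2)) * p * dt
    return P


print(p_closed_form(2.0, 1.0, 0.50, 0.25), p_euler(2.0, 1.0, 0.50, 0.25))
```

The two values should agree to several decimal places, and both tend to
$L_1/(L_1 + L_2)$ as $t$ grows, as in the discrete case.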
ON SOME PROCEDURES FOR THE REJECTION OF
SUSPECTED DATA


E. P. King
National Bureau of Standards


Most of the statistical tests that have been recommended for the
detection of a single outlier involve the difference between the
largest (or smallest) observation and some measure of the location of
the remaining members of the sample. In many situations, however,
there is no a priori basis for anticipating which extreme will be under
suspicion, with the result that this decision is based entirely on
sample evidence. The test statistic in such cases actually employs the
“more deviant” extreme, and this reordering is seldom taken into
account. In this note we shall show that, for two statistics commonly
used to detect the presence of a single outlier, the effect of this
“two-sided” hypothesis is approximately, but not exactly, to double the
significance level of the standard test procedure.
In cases where an accurate estimate of the population standard deviation, $\sigma$, is available, let us consider the statistic

$$u_n = \frac{x_n - \bar{x}}{\sigma} \quad \text{or} \quad u_1 = \frac{\bar{x} - x_1}{\sigma}$$

where $x_n$ and $x_1$ are the largest and smallest observations, respectively, in a sample of $n$, and $\bar{x}$ is the sample mean. Under the null hypothesis of random sampling from a normal population, the distribution of $u_n$ (or $u_1$, since the two are identically distributed) has been tabulated by Grubbs [2]. Let


$$u = \max(u_1, u_n)$$

under the same null hypothesis, and let $G_n(t)$ denote the distribution function of $u$ in samples of $n$. We have at once

$$G_n(t) = P(u_1 < t,\ u_n < t). \tag{1}$$

In case the sample size, $n$, is large it follows from the asymptotic independence of $u_1$ and $u_n$ that

$$G_n(t) = F_n^2(t)$$

approximately, where $F_n(t)$ denotes the common distribution function of $u_1$ and $u_n$. This is equivalent to

$$1 - G_n(t) = [1 - F_n(t)]\,[2 - \{1 - F_n(t)\}].$$


Thus the $100\alpha$ per cent point of $u_n$ (or $u_1$) is the $100\alpha(2-\alpha)$ per cent point of $u$. For practical purposes, where $\alpha < .10$, the $100\alpha$ per cent point of $u_n$ is the same as the $200\alpha$ per cent point of $u$, which means that the critical values of $u_n$ can be used provided that the significance level is doubled.
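The large-sample relation between the one-extreme and the two-sided significance levels is simple arithmetic, and a short sketch makes the "approximately double" statement concrete. The function name below is an illustrative choice; the formula $\alpha(2-\alpha)$ is the one given in the text and relies on its assumption of asymptotic independence of the two extremes.

```python
def two_sided_level(alpha):
    """Large-sample significance level of u = max(u_1, u_n) when the
    100*alpha per cent point of u_n is used as the critical value."""
    return alpha * (2.0 - alpha)


if __name__ == "__main__":
    for alpha in (0.01, 0.05, 0.10):
        # Compare with the exactly doubled level 2*alpha.
        print(alpha, two_sided_level(alpha), 2 * alpha)
```

The shortfall from exact doubling is $\alpha^2$, which is why the approximation is close for small $\alpha$ and visibly off at $\alpha = .10$.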

For small samples, where $n = 3$ and $n = 4$, the distribution function of $u$ is available in [3]. Using this function and the known percentage points of $u_n$, the corresponding significance levels for the statistic $u$ are obtained. The results are given in Table I.


TABLE I
SIGNIFICANCE LEVEL FOR "u" CORRESPONDING TO
A GIVEN SIGNIFICANCE LEVEL FOR "u_n"

            P{u ≥ u_n(α)}

                  α
  n       .01     .05     .10
  3      .018    .084    .159
  4      .020    .089    .170
  ∞      .020    .097    .190


In cases where $\sigma$ is unknown, Dixon's statistic [1] of the form

$$r_n = \frac{x_n - x_{n-1}}{x_n - x_1} \quad \text{or} \quad r_1 = \frac{x_2 - x_1}{x_n - x_1}$$

may be used where $x_1 < x_2 < \cdots < x_{n-1} < x_n$ is an ordered random sample of $n$ from a normal population. If we let


$$r = \max(r_1, r_n)$$

and denote the distribution function of $r$ by $S_n(z)$ we have

$$S_n(z) = P(r_1 < z,\ r_n < z) = P(r_1 < z) + P(r_n < z) - P(r_1 < z \text{ or } r_n < z).$$

Since $P(r_1 < z \text{ or } r_n < z) \le 1$, we obtain the inequality

$$S_n(z) \ge P(r_1 < z) + P(r_n < z) - 1.$$

Finally, letting $R_n(z)$ denote the common distribution function of $r_1$ and $r_n$ in samples of size $n$ the above inequality becomes

$$S_n(z) \ge 2R_n(z) - 1 \quad \text{for } 0 \le z \le 1,$$

which is equivalent to

$$2[1 - R_n(z)] \ge 1 - S_n(z), \qquad 0 \le z \le 1. \tag{2}$$


Hence the $100\alpha$ per cent point of $r_n$ is, at most, the $200\alpha$ per cent point of $r$, regardless of sample size. It is easily verified that (2) becomes an equality for $\tfrac{1}{2} \le z \le 1$ and a strict inequality for $0 \le z < \tfrac{1}{2}$.
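The bound (2) can be illustrated by simulation. The sketch below assumes the ratios are $r_1 = (x_2 - x_1)/(x_n - x_1)$ and $r_n = (x_n - x_{n-1})/(x_n - x_1)$ on the ordered sample, draws normal samples, and estimates the chance that $r_n$ alone, and the "more deviant" ratio $r = \max(r_1, r_n)$, reach a cut-off $z$. The sample size, cut-off, trial count and seed are arbitrary illustrative choices.

```python
import random


def dixon_ratios(sample):
    """Return (r_1, r_n) for one sample, using the ordered observations."""
    xs = sorted(sample)
    spread = xs[-1] - xs[0]
    r1 = (xs[1] - xs[0]) / spread      # gap at the low end
    rn = (xs[-1] - xs[-2]) / spread    # gap at the high end
    return r1, rn


def exceedance_rates(n=5, z=0.4, trials=20_000, seed=7):
    """Estimate P(r_n >= z) and P(max(r_1, r_n) >= z) under normality."""
    rng = random.Random(seed)
    hits_rn = hits_r = 0
    for _ in range(trials):
        r1, rn = dixon_ratios([rng.gauss(0.0, 1.0) for _ in range(n)])
        hits_rn += rn >= z
        hits_r += max(r1, rn) >= z
    return hits_rn / trials, hits_r / trials
```

Up to sampling error, the second rate should lie between the first rate and twice the first rate, which is the content of inequality (2).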


REFERENCES 


[1] Dixon, W. J., "Ratios involving extreme values," Annals of Mathematical
Statistics, 22 (1951), 68-78. 

[2] Grubbs, F. E., “Sample criteria for testing outlying observations,” Annals of 
Mathematical Statistics, 21 (1950), 27-58. 

[3] King, E. P., “The operating characteristic of the control chart for sample 
means,” Annals of Mathematical Statistics, 23 (1952), 384-95. 


ON THE USE OF RANGES, CROSS-RANGES AND 
EXTREMES IN COMPARING SMALL SAMPLES 


Hannes HYRENIUS 
University of Gothenburg 


1. INTRODUCTION 


IN ORDER to simplify and reduce the arithmetical work involved in analyses of statistical quality control, attempts have been made to derive useful substitutes for the ordinary methods of comparing two samples. Among these, mention may be made of the quotient of two ranges or two mean ranges as alternatives to the variance ratio in testing for homogeneity in variation. In a recent article [3] Paul R. Rider presents the distribution of the quotient of two sample ranges, the universe being defined by the rectangular distribution.

The purpose of the present article is to describe a procedure by which 
it is possible to test not only differences in variation but also differences 
in location. The method is based on the assumption of a rectangular
universe.

The definitions and the derivation of the sampling distributions are 
given in Sections 2-6. Section 7 shows the relation of the new test of 
variation to that studied by Rider. Section 8 gives a discussion of the 
proposed test statistics, indicating their usefulness and limitation. Sec- 
tion 9 explains the test tables given as an appendix. Finally, some nu- 
merical examples are presented in Section 10. 


2. DEFINITIONS

The universe to be sampled is the rectangular distribution

$$p(z)\,dz = \frac{dz}{B}, \qquad 0 \le z \le B. \tag{1}$$


We adopt the following notations: 


Sample 1: $N_1$ items; lower extreme $= u_1$; upper extreme $= v_1$;
Sample 2: $N_2$ items; lower extreme $= u_2$; upper extreme $= v_2$.

We choose as our sample 1 the sample having the smaller lower extreme, thus $u_1 \le u_2$. If $u_1 = u_2$, sample 1 may be taken as the sample first obtained.

The ranges are

$$R_{11} = v_1 - u_1, \tag{2a}$$

$$R_{22} = v_2 - u_2. \tag{2b}$$

We define the cross-range as

$$R_{21} = v_2 - u_1, \tag{3}$$

and the lower-extreme-difference as

$$S_{21} = u_2 - u_1. \tag{4}$$

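The analysis is to be performed by means of the following test-quotients

$$T = \frac{S_{21}}{R_{11}} = \frac{u_2 - u_1}{v_1 - u_1}, \tag{5a}$$

$$U = \frac{R_{22}}{R_{11}} = \frac{v_2 - u_2}{v_1 - u_1}, \tag{5b}$$

$$V = \frac{R_{21}}{R_{11}} = \frac{v_2 - u_1}{v_1 - u_1}. \tag{5c}$$

The three quotients are related by the equation

$$T + U = V. \tag{6}$$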
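The three quotients need only the extremes of each sample, which the following sketch makes explicit. It assumes the definitions $T = (u_2-u_1)/(v_1-u_1)$, $U = (v_2-u_2)/(v_1-u_1)$ and $V = (v_2-u_1)/(v_1-u_1)$, with sample 1 chosen as the sample having the smaller lower extreme; the function name `tuv` is a hypothetical label, and the case of a zero range in sample 1 is not handled.

```python
def tuv(sample_a, sample_b):
    """Return the quotients (T, U, V) for two samples.

    Sample 1 is the sample with the smaller minimum; on a tie the sample
    given first is kept as sample 1 (sorted() is stable).
    """
    first, second = sorted((sample_a, sample_b), key=min)
    u1, v1 = min(first), max(first)      # extremes of sample 1
    u2, v2 = min(second), max(second)    # extremes of sample 2
    r11 = v1 - u1                        # range of sample 1 (must be nonzero)
    t = (u2 - u1) / r11                  # lower-extreme-difference quotient
    u = (v2 - u2) / r11                  # range quotient
    v = (v2 - u1) / r11                  # cross-range quotient
    return t, u, v
```

For instance, `tuv([4, 3, 0, 1], [0, -1, 0, -2])` treats the second list as sample 1 because its minimum is smaller, and the returned values satisfy `t + u == v` up to rounding.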
3. THE DISTRIBUTION OF T


The distribution of $T$ can be derived in the following way.
For a general universe $p(z)$, the distribution of the lower extreme, $u$, in a sample of $N$ items is given by

$$f(u) = N\,p(u)\left[\int_u^{\infty} p(z)\,dz\right]^{N-1}. \tag{7}$$

If, specifically, $p(z)$ is the rectangular distribution given by (1), this formula reduces to

$$f(u) = \frac{N}{B^N}\,(B - u)^{N-1}. \tag{7'}$$

For the upper extreme, $v$, we have the general expression

$$f(v) = N\,p(v)\left[\int_{-\infty}^{v} p(z)\,dz\right]^{N-1}, \tag{8}$$

which for the rectangular case takes the form

$$f(v) = \frac{N}{B^N}\,v^{N-1}. \tag{8'}$$
BY E 


Holding the lower extreme of sample 1, $u_1$, fixed, we find that the distribution of $v_1$ is the distribution of an upper extreme in a sample of $N_1 - 1$ items, lying within the range $u_1 \le z \le B$. We hence obtain

$$f(v_1) = \frac{N_1 - 1}{(B - u_1)^{N_1 - 1}}\,(v_1 - u_1)^{N_1 - 2}.$$

For $u_2$ $(\ge u_1)$ we find

$$f(u_2) = \frac{N_2}{(B - u_1)^{N_2}}\,(B - u_2)^{N_2 - 1}.$$

The joint distribution of $v_1$ and $u_2$ for a fixed value $u_1$ hence is given by the product of the last two distributions

$$f(v_1, u_2) = f(v_1)\,f(u_2).$$
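Writing $u_2 = u_1 + T(v_1 - u_1)$, and integrating over $v_1$, we obtain

$$f(T) = (N_1 - 1)\,N_2 \sum_{r=0}^{N_2 - 1} (-1)^r \binom{N_2 - 1}{r} \frac{T^r}{N_1 + r}, \qquad 0 \le T \le 1, \tag{9a}$$

corresponding to the case $u_2 \le v_1$.
The distribution is independent of $u_1$, and obviously gives the general distribution of the quotient $T$ for all values of $u_1$ from $0$ to $B$.
For $u_2 > v_1$ we obtain

$$f(T) = (N_1 - 1)\,N_2\,\frac{(N_1 - 1)!\,(N_2 - 1)!}{(N_1 + N_2 - 1)!}\,\frac{1}{T^{N_1}}, \qquad 1 \le T \le \infty. \tag{9b}$$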


It is easily seen that the total area for all values of the quotient T 
adds up to unity. 


4. THE DISTRIBUTION OF U


The distribution of the range quotient $U$ is derived from the joint distribution of $v_1$, $u_2$, and $v_2$ for a fixed value $u_1$

$$f(v_1, u_2, v_2) = \frac{(N_1 - 1)\,N_2\,(N_2 - 1)}{(B - u_1)^{N_1 + N_2 - 1}}\,(v_1 - u_1)^{N_1 - 2}\,(v_2 - u_2)^{N_2 - 2}.$$

Introducing $v_2 - u_2 = U(v_1 - u_1)$, and integrating for $u_2$ and $v_1$, we obtain the distribution of $U$ as

$$f(U) = \frac{(N_1 - 1)\,N_2\,(N_2 - 1)}{(N_1 + N_2 - 1)(N_1 + N_2 - 2)}\left[(N_1 + N_2 - 1)\,U^{N_2 - 2} - (N_1 + N_2 - 2)\,U^{N_2 - 1}\right], \qquad 0 \le U \le 1. \tag{10a}$$

The distribution is independent of $u_1$, and is thus valid for the whole span $0$ to $B$. For $U > 1$ we obtain

$$f(U) = \frac{(N_1 - 1)\,N_2\,(N_2 - 1)}{(N_1 + N_2 - 1)(N_1 + N_2 - 2)}\,\frac{1}{U^{N_1}}, \qquad 1 \le U \le \infty. \tag{10b}$$
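5. THE DISTRIBUTION OF V


The distribution of the cross-range quotient $V$ is derived analogously from the joint distribution of $v_1$ and $v_2$ for a fixed value $u_1$:

$$f(v_1, v_2) = \frac{(N_1 - 1)\,N_2}{(B - u_1)^{N_1 + N_2 - 1}}\,(v_1 - u_1)^{N_1 - 2}\,(v_2 - u_1)^{N_2 - 1}.$$

Writing $v_2 - u_1 = V(v_1 - u_1)$ and integrating, we obtain

$$f(V) = \frac{(N_1 - 1)\,N_2}{N_1 + N_2 - 1}\,V^{N_2 - 1}, \qquad 0 \le V \le 1, \tag{11a}$$

$$f(V) = \frac{(N_1 - 1)\,N_2}{N_1 + N_2 - 1}\,\frac{1}{V^{N_1}}, \qquad 1 \le V \le \infty. \tag{11b}$$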


6. MEANS AND VARIANCES OF T, U, AND V


A few remarks should be made about the sampling distributions of the three quotients. From the type of population distribution it is immediately clear that, with increasing sample sizes, the quotient $T$ with probability 1 tends to 0 while $U$ and $V$ in the same way tend to 1. The distributions reduce in the limits to the unitary distribution for $T = 0$, $U = 1$ and $V = 1$.

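As for the means we find the general expressions

$$m(T) = \frac{N_1 - 1}{N_1 - 2} \cdot \frac{1}{N_2 + 1}, \tag{12a}$$

$$m(U) = \frac{N_1 - 1}{N_1 - 2} \cdot \frac{N_2 - 1}{N_2 + 1}, \tag{12b}$$

$$m(V) = \frac{N_1 - 1}{N_1 - 2} \cdot \frac{N_2}{N_2 + 1}. \tag{12c}$$

The variances are

$$\sigma^2(T) = \frac{(N_1 - 1)\left[(N_1 - 2)^2 N_2 + (N_2 + 2)\right]}{(N_1 - 3)(N_1 - 2)^2 (N_2 + 1)^2 (N_2 + 2)}, \tag{13a}$$

$$\sigma^2(U) = \frac{(N_1 - 1)(N_2 - 1)\left[2(N_1 - 2)^2 + N_2(N_2 + 1) - 2\right]}{(N_1 - 3)(N_1 - 2)^2 (N_2 + 1)^2 (N_2 + 2)}, \tag{13b}$$

$$\sigma^2(V) = \frac{(N_1 - 1)\,N_2\left[(N_1 - 2)^2 + (N_2 + 1)^2 - 1\right]}{(N_1 - 3)(N_1 - 2)^2 (N_2 + 1)^2 (N_2 + 2)}. \tag{13c}$$

The formulas conform to the statement just given.
The form of the distributions is illustrated in Diagram 1 for sample sizes $N_1 = N_2 = 4$.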
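The mean and variance expressions can be evaluated exactly with rational arithmetic. The sketch below assumes the forms $m(T) = (N_1-1)/[(N_1-2)(N_2+1)]$, $m(U) = (N_1-1)(N_2-1)/[(N_1-2)(N_2+1)]$, $m(V) = (N_1-1)N_2/[(N_1-2)(N_2+1)]$ and the corresponding variance of $T$; the function names are illustrative, and the means require $N_1 > 2$ while the variance requires $N_1 > 3$.

```python
from fractions import Fraction


def mean_t(n1, n2):
    """Mean of T (needs n1 > 2)."""
    return Fraction(n1 - 1, (n1 - 2) * (n2 + 1))


def mean_u(n1, n2):
    """Mean of U (needs n1 > 2)."""
    return Fraction((n1 - 1) * (n2 - 1), (n1 - 2) * (n2 + 1))


def mean_v(n1, n2):
    """Mean of V (needs n1 > 2)."""
    return Fraction((n1 - 1) * n2, (n1 - 2) * (n2 + 1))


def var_t(n1, n2):
    """Variance of T (needs n1 > 3)."""
    numerator = (n1 - 1) * ((n1 - 2) ** 2 * n2 + (n2 + 2))
    denominator = (n1 - 3) * (n1 - 2) ** 2 * (n2 + 1) ** 2 * (n2 + 2)
    return Fraction(numerator, denominator)
```

Because $T + U = V$, the means must add in the same way, which gives a quick consistency check; for large equal sample sizes `mean_t` approaches 0 while `mean_u` and `mean_v` approach 1.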
7. RELATION OF U TO RIDER'S u


The procedure of selecting the sample with the smaller lower extreme as a kind of origin constitutes a difference from the procedure used by Rider [3] in testing the range-ratio $u = R_1/R_2$.

It is easily verified that the probability of $u_1 \le u_2$ is $N_1/(N_1 + N_2)$, while the probability of $u_2 < u_1$ is $N_2/(N_1 + N_2)$. From this the relation between the distributions of $U$ and $u$ is found in the following way:

If we distinguish the two distributions (10a) and (10b) as $f_a(U)$ and $f_b(U)$, we may derive their complementary functions by exchanging $N_1$ and $N_2$, and at the same time changing $U$ into $1/U$. The two new distributions may be denoted $g_a(U)$ and $g_b(U)$.

Now, the two distributions of $u = R_1/R_2$, as derived by Rider, are obtained by a simple weighting:

$$f(u) = \frac{N_1}{N_1 + N_2}\,f_a(u) + \frac{N_2}{N_1 + N_2}\,g_b(u) = \frac{N_1(N_1 - 1)\,N_2(N_2 - 1)}{(N_1 + N_2)(N_1 + N_2 - 1)(N_1 + N_2 - 2)}\left[(N_1 + N_2)\,u^{N_2 - 2} - (N_1 + N_2 - 2)\,u^{N_2 - 1}\right], \qquad 0 \le u \le 1, \tag{14a}$$

$$f(u) = \frac{N_2}{N_1 + N_2}\,g_a(u) + \frac{N_1}{N_1 + N_2}\,f_b(u) = \frac{N_1(N_1 - 1)\,N_2(N_2 - 1)}{(N_1 + N_2)(N_1 + N_2 - 1)(N_1 + N_2 - 2)}\left[\frac{N_1 + N_2}{u^{N_1}} - \frac{N_1 + N_2 - 2}{u^{N_1 + 1}}\right], \qquad 1 \le u \le \infty. \tag{14b}$$

Diagram 1. The distribution of T, U, V for $N_1 = N_2 = 4$. The shaded areas indicate: for T, the upper 5 per cent tail; for U and V, the lower and the upper 2.5 per cent tails.


A similar weighting could, of course, be made with the distributions of $T$ and $V$. Because of the complication arising from the signs of the differences $u_i - u_j$ and $v_i - u_j$ $(i, j = 1, 2)$ it is considered preferable to use the procedure adopted here, i.e. selecting the smaller of the two lower extremes as a starting point.


8. DISCUSSION OF THE TESTS

The three quotients T, U, and V may be briefly characterized in the following way:

T gives a test for possible differences in location.

U gives a test for possible differences in variation.

V gives a test for possible differences in location or in variation or in both.
The use of a range-ratio in studying differences in variances may not 
call for any specific justification. As for the use of extreme values in 
comparing differences in location, however, there are obviously quite a
number of possible characteristics with a higher or lower degree of 
efficiency. 

The general desirability of using a most efficient statistic is very 
often hindered by the mathematical complexity in deriving the sam- 
pling distributions. On the other hand, the inefficiency of a statistic 
may sometimes be balanced by the simplicity of its calculation and 
use, as is the case in several types of routine work such as, e.g., statisti- 
cal quality control. Although the extreme values usually are very inefficient statistics, they are in some respects good when dealing with samples from rectangular universes. The power efficiency of the proposed statistics and some related statistics is going to be studied.

In this connection the question arises how other population forms might affect the sampling distributions of T, U, V, and the relative efficiency. It is, e.g., of interest to know the critical values in the case of a normal or nearly-normal universe. These questions are being studied for a set of different frequency functions covering a variety of population types.

From preliminary results arising from these investigations it may 
be noted here that very skew population distributions give rise to con- 
siderable deviations in the sampling distributions and hence in the 
critical values. For symmetrical and moderately asymmetrical uni- 
verses, the critical values show fairly small deviations from those ob- 
tained for the rectangular case.

Because V includes the effects of sampling variations in both T and 
U, it may obviously, under certain conditions, fail to reveal existing 
large deviations of the two addends, namely, whenever the deviations 
of T and U are in opposite directions. On the other hand, a significant 
value of V may be due to large although non-significant values of T 
and U. When using V, one must therefore interpret the results with 
more care than is necessary when applying the simple T- and U-tests.

As already indicated the three tests presented here are primarily 
thought of as being useful in routine work in statistical quality con- 
trol. Under such circumstances it may sometimes be considered useful 
to apply the V-test as a first guide in "hunting for troubles." Usually, however, it seems better to use the separate T- and U-tests. The tests might accordingly be referred to as the TU-tests or, if the V-test is also being considered, the TUV-tests.


9. TABLES 


In Tables 1-3 are given percentage points of the distributions of T, U, and V for sample sizes from 2 to 10. It is to be noted that one should use two-tail tests with regard to U and V, while T is to be used as a one-tail test. The critical values are consequently given by means of the 10, 5, and 1 percentage points for T, and by means of the 99.5, 97.5, 95, 5, 2.5, and 0.5 percentage points for U and V. (See Diagram 1.)


10. NUMERICAL EXAMPLES 
In order to illustrate the use of the TUV-tests, a few examples are given.

A. Applying the tests to the data used by Rider [3] we have

$N_1 = 5$, $u_1 = 72$, $v_1 = 80$
$N_2 = 10$, $u_2 = 75$, $v_2 = 79$.

From these values we obtain

T = 0.375, U = 0.500, V = 0.875.

It is found from the tables that T is significantly large at the .05 level, while U is low at the .01 level. The last result is, of course, in agreement with that which was shown by Rider when using his u-test. The use of the TU-tests thus reveals more differences than could be seen by the u-test alone.

The fact that T and U deviate in opposite directions leads to the re- 
sult that V does not give any significant indication even on the .10 
level. 

B. In the factory control of ball bearings at the S.K.F. in Gothenburg, the groove location was measured from a certain specified norm. Samples of 4 being taken, the following data were obtained on two different occasions:

Sample A    4, 3, 0, 1,
Sample B    0, −1, 0, −2.

Sample B being our sample 1, we obtain

T = 1.0, U = 2.0, V = 3.0.
It is found that there is a shift in location, significant at the 5 per cent level, as well as a slight shift in variation, significant at the 10 per cent level. The cross-range test V is also significantly high.

C. Data on production of metal knobs (W. B. Rice, Control Charts in Factory Management, Tables 4 and 5). Four polished metal knobs were measured, in 1/1000 inch, for two kinds of steel with different hardness. The following values were observed:

Sample 26    742, 744, 742, 737
Sample 31    749, 744, 749, 747

By calculating $\bar{x}$ and $s^2$ we find by ordinary methods that the variances do not differ significantly and that the averages differ at the 5 per cent level ($t = 3.15$ with $t_{.05} = 2.45$).

Using the more rapid TU-tests we have T = 1.00, U = 0.71. The tables show that there is a difference in location at the 5 per cent level but no difference in variation. The findings are consequently in accordance with what was obtained in the more laborious way by the t- and F-tests.


TABLE 1. UPPER PERCENTAGE POINTS OF T

Given two samples from continuous rectangular populations, call the sample with smaller minimum observation the first sample.

Let $N_i$ = number of observations in the i-th sample
$u_i$ = minimum observation in the i-th sample
$v_i$ = maximum observation in the i-th sample

Then $T = (u_2 - u_1)/(v_1 - u_1)$. (Tabulated values not reproduced.)

TABLE 2. UPPER AND LOWER PERCENTAGE POINTS OF U

Using the same notation as for Table 1, $U = (v_2 - u_2)/(v_1 - u_1)$. (Tabulated values not reproduced.)

TABLE 3. UPPER PERCENTAGE POINTS OF V

Using the same notation as for Table 1, $V = (v_2 - u_1)/(v_1 - u_1)$. (Tabulated values not reproduced.)


THE DISTRIBUTION OF THE PRODUCT OF RANGES
IN SAMPLES FROM A RECTANGULAR POPULATION


PAUL R. RIDER
Washington University and Wright-Patterson Air Force Base


INTRODUCTION AND SUMMARY 


IN AN earlier paper [4] the distribution of the quotient of the ranges of two independent, random samples from a continuous rectangular population was given and discussed. (See also [3].) In the present paper the distribution of the product of such ranges is derived.

For the distribution of the product of the ranges of two independent 
samples a general formula is given. This formula fails to hold if the 
sample sizes are the same or if they differ by unity, and special con- 
sideration has to be given to these two cases. 

The distribution of the product of the ranges of k independent sam- 
ples of equal size is derived. 

Simple formulas for the moments about the origin are given for all 
of these distributions. 


PRODUCT OF TWO RANGES 


The population which we wish to consider is the rectangular population given by

$$f(x) = \begin{cases} 1 & \text{for } 0 \le x \le 1, \\ 0 & \text{elsewhere.} \end{cases} \tag{1}$$
For the purpose of keeping formulas simpler, a unit interval for the 
independent variable has been chosen. While the interval over which 
the population extends has no effect on the distribution of quotients of 
ranges, it will obviously have an effect on the distribution of products 
of ranges. However, there is no essential loss of generality in assuming 
that this interval is unity. 

It is well known that the distribution of ranges in samples of size n 
from the population (1) is 


$$n(n-1)\,x^{n-2}(1-x)\,dx. \tag{2}$$


(See, for example, [2], page 192. In this reference the distribution is given in cumulative form, which must be differentiated to yield (2).) If $x_1$ is the range of a sample of size $m$ and $x_2$ the range of a sample of size $n$ from (1), then if the samples are independent, the joint distribution of $x_1$ and $x_2$ is

$$m(m-1)\,n(n-1)\,x_1^{m-2}x_2^{n-2}(1-x_1)(1-x_2)\,dx_1\,dx_2. \tag{3}$$

We wish to determine the distribution of the product $u = x_1x_2$.

To do this we first replace, in (3), $x_2$ by $x_1^{-1}u$ and $dx_2$ by $x_1^{-1}du$. This yields

$$m(m-1)\,n(n-1)\,u^{n-2}\,x_1^{m-n-1}(1-x_1)(1 - x_1^{-1}u)\,dx_1\,du. \tag{4}$$

To get the distribution of $u$ we integrate (4) with respect to $x_1$ from $u$ to 1, since these values are the minimum and the maximum, respectively, that $x_1$ can attain. After some simplification, the distribution of $u$ is found to be

$$\frac{m(m-1)\,n(n-1)}{(m-n)\left[(m-n)^2 - 1\right]}\Big\{\left[(m-n+1) - (m-n-1)u\right]u^{m-2} + \left[(m-n-1) - (m-n+1)u\right]u^{n-2}\Big\}\,du, \tag{5}$$

in which $m - n$ is different from 0 and $\pm 1$.

Since, if the samples are of unequal size, it is immaterial which is larger, we shall assume $m \ge n$ and consider the cases $m - n = 0$ and $m - n = 1$, that is, the case in which the samples are of equal size and the case in which one sample contains one more item than the other.

For the case $m - n = 0$, we find, by using the same method, that the distribution of $u$ is

$$-n^2(n-1)^2\,u^{n-2}\left[2(1-u) + (1+u)\log u\right]du. \tag{6}$$

Similarly, for the case $m - n = 1$, we get

$$n^2(n^2-1)\,u^{n-2}\left(\tfrac{1}{2} - \tfrac{1}{2}u^2 + u\log u\right)du. \tag{7}$$
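A quick way to check the three forms of the density of the product of two ranges is to confirm numerically that each integrates to unity over $0 < u < 1$. The sketch below assumes the general form for $m - n \ge 2$ and the two special forms for $m - n = 0$ and $m - n = 1$ as read from this section; the midpoint rule and the function names are illustrative choices, and the logarithm is taken as the natural logarithm.

```python
import math


def product_density(u, m, n):
    """Density of the product of two ranges for sample sizes m >= n >= 2."""
    k = m - n
    if k == 0:   # equal sample sizes
        return -n**2 * (n - 1)**2 * u**(n - 2) * (2 * (1 - u) + (1 + u) * math.log(u))
    if k == 1:   # sizes differing by one
        return n**2 * (n**2 - 1) * u**(n - 2) * (0.5 - 0.5 * u * u + u * math.log(u))
    # general case, m - n >= 2
    c = m * (m - 1) * n * (n - 1) / (k * (k * k - 1))
    return c * (((k + 1) - (k - 1) * u) * u**(m - 2)
                + ((k - 1) - (k + 1) * u) * u**(n - 2))


def integrate_unit_interval(f, steps=100_000):
    """Midpoint rule on (0, 1); avoids evaluating log at u = 0."""
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))
```

For example, `integrate_unit_interval(lambda u: product_density(u, 5, 3))` should be very close to 1, and the same holds for the pairs `(3, 3)` and `(4, 3)`.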


PRODUCT OF k RANGES 


We shall now consider the product of $k$ ranges, limiting the discussion to the case of equal sample sizes.

To derive the distribution of the product of three ranges in independent samples from the population (1) we replace the variable $u$ in (6) by $y$ and multiply the result by (2). This gives

$$-n^3(n-1)^3\,x^{n-2}y^{n-2}(1-x)\left[2(1-y) + (1+y)\log y\right]dx\,dy. \tag{8}$$

Since $y$ is the product of the ranges of two independent samples and $x$ is the range of a sample which is independent of either of the samples involved in the product $y$, (8) gives the joint distribution of $x$ and $y$.

If we let $u = xy$, then $u$ is the product of the ranges of three independent samples from (1). We replace, in (8), $x$ by $y^{-1}u$ and $dx$ by $y^{-1}du$. The distribution of $u$ can now be found by integrating with respect to $y$ from $y = u$ to $y = 1$.

Repeating this process again and again, we find that the distribution of the product $u$ of $k$ ranges of independent samples of size $n$ from the rectangular population (1) is given by

$$(-1)^{k-1}n^k(n-1)^k\,u^{n-2}\left[a_1(1-u) + a_2(1+u)\log u + a_3(1-u)\log^2 u + \cdots + a_k\{1 + (-1)^k u\}\log^{k-1} u\right]du, \tag{9}$$

in which

$$a_k = \frac{1}{(k-1)!}, \quad \ldots, \quad a_{k-r+1} = \frac{(k+r-2)!}{(k-1)!\,(r-1)!\,(k-r)!}, \quad \ldots, \quad a_1 = \frac{(2k-2)!}{[(k-1)!]^2}. \tag{10}$$

The result can be proved by mathematical induction.
The distribution for $k = 2$ is given by (6). The distributions for $k = 3, 4, 5, 6$ follow.

$$\begin{aligned}
& n^3(n-1)^3u^{n-2}\left[6(1-u) + 3(1+u)\log u + \tfrac{1}{2}(1-u)\log^2 u\right]du, \\
-& n^4(n-1)^4u^{n-2}\left[20(1-u) + 10(1+u)\log u + 2(1-u)\log^2 u + \tfrac{1}{6}(1+u)\log^3 u\right]du, \\
& n^5(n-1)^5u^{n-2}\left[70(1-u) + 35(1+u)\log u + \tfrac{15}{2}(1-u)\log^2 u + \tfrac{5}{6}(1+u)\log^3 u + \tfrac{1}{24}(1-u)\log^4 u\right]du, \\
-& n^6(n-1)^6u^{n-2}\left[252(1-u) + 126(1+u)\log u + 28(1-u)\log^2 u + \tfrac{7}{2}(1+u)\log^3 u + \tfrac{1}{4}(1-u)\log^4 u + \tfrac{1}{120}(1+u)\log^5 u\right]du.
\end{aligned} \tag{11}$$
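The coefficients $a_1, \ldots, a_k$ in the general formula can be generated mechanically, which is a convenient check on the listed cases $k = 3, 4, 5, 6$. The sketch below assumes the coefficient formula in the equivalent form $a_r = \binom{2k-r-1}{k-r}/(r-1)!$, obtained by re-indexing the expression for $a_{k-r+1}$; the function name is an illustrative choice.

```python
from fractions import Fraction
from math import comb, factorial


def range_product_coefficients(k):
    """Return [a_1, ..., a_k] for the product of k ranges as exact fractions."""
    return [
        Fraction(comb(2 * k - r - 1, k - r), factorial(r - 1))
        for r in range(1, k + 1)
    ]


if __name__ == "__main__":
    for k in (2, 3, 4, 5, 6):
        print(k, [str(a) for a in range_product_coefficients(k)])
```

For $k = 2$ this gives the coefficients 2 and 1 appearing in the distribution for equal sample sizes, and in every case the last coefficient is $1/(k-1)!$.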


MOMENTS 


For the distribution of the product of the ranges of two independent samples of size $m$ and $n$ respectively, with $m - n > 1$, the moment of order $j$ about the origin is

$$\mu_j' = \frac{m(m-1)\,n(n-1)}{(m+j)(m+j-1)(n+j)(n+j-1)}. \tag{12}$$

If $m - n = 1$, then

$$\mu_j' = \frac{n^2(n^2-1)}{(n+j-1)(n+j)^2(n+j+1)}. \tag{13}$$

For the distribution of the product of the ranges of $k$ independent samples of $n$ items each, the $j$th moment about the origin is

$$\mu_j' = \frac{n^k(n-1)^k}{(n+j)^k(n+j-1)^k}. \tag{14}$$

The formulas for the central moments are not simple and consequently are of no particular interest.


CONFIDENCE AND TOLERANCE INTERVALS FOR
THE NORMAL DISTRIBUTION*


FRANK PROSCHAN
National Bureau of Standards†


Confidence and tolerance intervals for the normal distribu- 
tion are presented for the various cases of known and un- 
known mean and standard deviation. Practical illustration 
and interpretation of these intervals are given. Tables are 
presented permitting a comparison among the intervals. Fi-
nally the relationship between the two types of intervals is 
described. 


1. Introduction. Discussions of the theory of errors will sometimes 
state that the mean plus or minus the probable error will include 50% 
of future observations (assumed normally distributed). This, of course, 
is true only if the mean and the probable error of the population itself 
are used. Unfortunately, in most practical problems one or both of 
these may not be known. Experimenters who use the sample mean plus 
or minus the sample probable error with the expectation that this 
interval will contain 50% of future observations may be seriously delud- 
ing themselves. 

However it is possible to construct intervals of the type $\bar{x} \pm ks$ ($\bar{x}$ = sample mean, $s$ = sample standard deviation) which will, on the average, include 50% of the population. From this one is led to a more general consideration of such intervals, and to the uses to which they can be put.

All populations discussed in this paper are normal unless otherwise specified. Let $\mu$, $\sigma$ refer to the population mean and standard deviation respectively.

Any one of four possible situations may exist: (a) $\mu$, $\sigma$ both known; (b) $\mu$ unknown, $\sigma$ known; (c) $\mu$ known, $\sigma$ unknown; (d) $\mu$, $\sigma$ both unknown.

Let $m$ represent either $\mu$ or $\bar{x}$; let s.d. represent either $\sigma$ or $s$. Then two important types of assertions may be made about intervals of the form

$$m \pm k\ \text{s.d.} \tag{1}$$

A. Confidence interval. The probability is $\gamma$ that the interval (1) contains the population mean (or, alternatively, the second sample mean).


B. Tolerance interval. In a large series of repeated samples the proportion of the population contained in (1) is

(B1) $\alpha$, on the average;

(B2) $P$ or more, $\gamma$ of the time.

* Presented at the annual meeting of the American Statistical Association, Boston, December 1951.
† Now at Syracuse Electric Products, Inc., Hicksville, N. Y.

In this paper, a comparison is made among the values of k appro- 
priate to the respective cases obtained from various combinations of 
A and B with (a), (b), (c), and (d). Practical illustrations and inter- 
pretations are given of these cases. 

In addition, details are given of a proof of a result by Wilks (1941) for the case B1. These details are given because they are suggestive of a general method applicable in such problems. Also, tables are presented of values of k for a certain class of confidence and tolerance intervals.

Finally, the relationship between confidence intervals and tolerance 
intervals is discussed. 

2. Definition of symbols. For convenience, the definitions of symbols are brought together in this section.

$\mu$ = population mean

$\sigma$ = population standard deviation

$\bar{x} = \sum x_i / n$ = mean of a sample of $n$ observations

$s = \sqrt{\sum (x_i - \bar{x})^2 / (n-1)}$ = sample standard deviation

$m$ = $\mu$ or $\bar{x}$

s.d. = $\sigma$ or $s$

$p$ = proportion of the population contained in $m \pm k$ s.d., where $k$ = constant

Given a normal distribution with $\mu = 0$, $\sigma = 1$. Then

$z_{\alpha}$ = normal curve deviate which is exceeded in absolute value with probability $\alpha$

$t_{\alpha,\,n-1}$ = Student-t value for $n-1$ degrees of freedom which is exceeded in absolute value with probability $\alpha$

$\chi^2_{\alpha,\,n-1}$ = Chi-square value with $n-1$ degrees of freedom which is exceeded with probability $\alpha$.

3. Confidence intervals. A chemist makes $n$ determinations of the iron content, $\mu$, of a solution. What interval shall he select so that he can assert with 50% confidence that $\mu$ lies within that interval? The distribution of observations is normal with mean $\mu$.

3.1 For the population mean, standard deviation known. First, consider the case where the chemist knows $\sigma$. (The determination is of a routine type, for which a great many sets of previous observations are available, from which $\sigma$ is calculated.) In this case

$$\bar{x} \pm \frac{.6745}{\sqrt{n}}\,\sigma$$

will contain the "true" value (population mean) 50% of the time. This may be seen from the following diagram:

[Diagram showing the intervals AB and CD.]

Lay off AB: $\mu \pm (.6745/\sqrt{n})\sigma$, and CD: $\bar{x} \pm (.6745/\sqrt{n})\sigma$. Notice that when $\bar{x}$ lies in AB, $\mu$ must of necessity lie in CD; and when $\bar{x}$ does not lie in AB, $\mu$ must lie outside of CD. But since $\bar{x}$ is normally distributed with mean $\mu$, standard deviation $\sigma/\sqrt{n}$, the probability is .50 that $\bar{x}$ will lie in AB. Hence the probability is .50 that CD contains $\mu$.

Values of $k_1 = .6745/\sqrt{n}$ for $n = 2(1)30, 40, 60, 120, \infty$ are presented in Table 1.


To generalize, when the confidence coefficient is $\gamma$, the confidence interval for $\mu$ is

$$\bar{x} \pm \frac{z_{1-\gamma}}{\sqrt{n}}\,\sigma.$$

3.2 For the population mean, standard deviation unknown. Consider the case where the only information about $\sigma$ is in the present sample. Then the interval

$$\bar{x} \pm \frac{t_{.50,\,n-1}}{\sqrt{n}}\,s$$

will contain $\mu$ 50% of the time. The proof is similar to that of Section 3.1. Values of $k_2 = t_{.50,\,n-1}/\sqrt{n}$ for $n = 2(1)30, 40, 60, 120, \infty$ are presented in Table 1. Comparison of $k_1$ and $k_2$ shows $k_2 > k_1$, but as $n \to \infty$, $k_2 \to k_1$.

In general, when the confidence coefficient is $\gamma$, the confidence interval is

$$\bar{x} \pm \frac{t_{1-\gamma,\,n-1}}{\sqrt{n}}\,s.$$
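The factor for the known-standard-deviation case can be computed directly from the normal distribution. The sketch below assumes the factor is the two-sided normal deviate for confidence $\gamma$ divided by $\sqrt{n}$, so that $\gamma = .50$ gives $.6745/\sqrt{n}$; the function name is an illustrative choice, and the Student-t factor is not computed because the Python standard library has no t quantile function.

```python
from math import sqrt
from statistics import NormalDist


def k1(n, confidence=0.50):
    """Factor k such that x_bar +/- k*sigma covers mu with the given confidence.

    Uses the standard normal deviate exceeded in absolute value with
    probability 1 - confidence.
    """
    deviate = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return deviate / sqrt(n)


if __name__ == "__main__":
    for n in (2, 3, 10, 30):
        print(n, round(k1(n), 3))
```

The printed values can be compared with the first column of factors in Table 1.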


3.3 "Confidence interval" for second sample mean. Suppose the chemist who made the iron determinations wishes to set up a confidence interval, not for $\mu$, but for the mean, $\bar{x}_2$, of a second sample of $n_2$ observations. Such an interval might be called more appropriately a prediction interval, since the term "confidence interval" generally refers to population parameters.

Let us call the mean of the first sample $\bar{x}_1$, and its size $n_1$. Since the statistic

$$\frac{\bar{x}_1 - \bar{x}_2}{s_1\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

is distributed as the Student-t ratio, it follows that the interval

$$\bar{x}_1 \pm t_{.50,\,n_1-1}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\;s_1 \tag{2}$$

will constitute a 50% prediction interval for $\bar{x}_2$ [1].

What does this mean? It simply means that if pairs of samples of size $n_1$ and $n_2$ respectively, with means $\bar{x}_{1i}$ and $\bar{x}_{2i}$ $(i = 1, 2, \ldots)$ respectively, are drawn repeatedly, then for 50% of these pairs $\bar{x}_{2i}$ will lie in

$$\bar{x}_{1i} \pm t_{.50,\,n_1-1}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\;s_{1i}.$$

It does not mean that if one first sample of size $n_1$ with mean $\bar{x}_1$ is drawn, to be followed by the drawing of a great many "second" samples of size $n_2$ with means $\bar{x}_{2i}$ $(i = 1, 2, \ldots)$, then for 50% of the "second" samples $\bar{x}_{2i}$ will lie in (2).

When $n_1 = n_2 = n$ the coefficient of $s_1$ in (2) becomes

$$k_3 = t_{.50,\,n-1}\sqrt{\frac{2}{n}}.$$

Values of $k_3$ for $n = 2(1)30, 40, 60, 120, \infty$ are given in Table 1 for purposes of comparison. Note that $k_3 = \sqrt{2}\,k_2$ simply.

In general, if the "confidence" coefficient is to be $\gamma$ for $\bar{x}_2$, then the interval to be used is

$$\bar{x}_1 \pm t_{1-\gamma,\,n_1-1}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\;s_1.$$

TABLE 1
FACTORS FOR 50% CONFIDENCE INTERVALS

For                          $\mu$                  $\mu$                $\bar{x}_2$
$\sigma$ known (K) or
unknown (U)                  K                      U                    U
Form of interval             $\bar{x} \pm k_1\sigma$    $\bar{x} \pm k_2 s$    $\bar{x}_1 \pm k_3 s_1$

   n       k_1      k_2      k_3
   2      .477     .707    1.000
   3      .389     .471     .666
   4      .337     .382     .541
   5      .302     .331     .469
   6      .275     .297     .420
   7      .255     .271     .384
   8      .238     .251     .356
   9      .225     .235     .333
  10      .213     .222     .314
  11      .203     .211     .299
  12      .195     .201     .285
  13      .187     .193     .273
  14      .180     .185     .262
  15      .174     .179     .253
  16      .169     .173     .244
  17      .164     .167     .237
  18      .159     .162     .230
  19      .155     .158     .223
  20      .151     .154     .218
  21      .147     .150     .212
  22      .144     .146     .207
  23      .141     .143     .202
  24      .138     .140     .198
  25      .135     .137     .194
  26      .132     .134     .190
  27      .130     .132     .186
  28      .127     .129     .183
  29      .125     .127     .179
  30      .123     .125     .176
  40      .107     .108     .152
  60      .087     .088     .124
 120      .062     .062     .087
   ∞      0        0        0

For discussion
see Section      3.1      3.2      3.3

4. Tolerance intervals. In Section 3 an interval of type (1) was formed to contain the population mean (with a certain confidence). Suppose now, we are interested in setting up an interval of type (1) which will contain a certain proportion $p$ of the population. Such an interval is known as a tolerance interval.

If either $\mu$ or $\sigma$ is unknown, then the interval (1), containing $\bar{x}$ or $s$, is random. Hence the proportion $p$ contained in (1) will be a random variable.
4.1 Average value of p. In Section 4.1 we determine $k$ so that on the average the proportion $p$ is equal to $\alpha$, a constant. In Section 4.2 we determine $k$ so that in a large series of samples from normal universes a certain proportion $\gamma$ of the intervals (1) will include a proportion $P$ or more of the universe.

4.1.1 Population mean and standard deviation known. In this case

$$\mu \pm k\sigma \tag{3}$$

may be used as a tolerance interval. The proportion $p$ contained in (3) is constant, and the appropriate value of $k$ for specified $p$ may be obtained from a table of normal areas. Thus for $p = .50$, $k = .6745$ (listed in Table 2 for purposes of comparison).

4.1.2 Population mean and standard deviation unknown. Unfortunately,
in most practical problems μ and σ are not known. Hence x̄ and
s must be used. How shall we determine k so that the average p contained
in x̄ᵢ ± ksᵢ (i = 1, 2, ...) will be α?
Wilks [8] gave a solution without presenting the details of the proof.
(For an independent derivation see Appendix.) The solution states that
the tolerance limits which will include, on the average, a proportion
α of the normal universe are

    \bar{x} \pm t_{1-\alpha,\,n-1} \sqrt{\frac{n+1}{n}}\; s ,     (4)

where t_{1−α, n−1} denotes the value that a Student t variable with n − 1
degrees of freedom exceeds in absolute value with probability 1 − α.


Values of k_α = t_{1−α, n−1} √((n+1)/n) for n = 2 (1) 30, 40, 60, 120, ∞ and for
α = .50, .75, .90, .95, .98, .99, .999 are given in Table 3. This table should
be of use both to the experimenter and to the quality controller; it
supplements the values of k given in [3]. In addition, for purposes of
comparison, Table 2 gives values of k₅ (that is, k_α for α = .50) for
n = 2 (1) 30, 40, 60, 120, ∞.


An example of the use of Table 3 is given:

Example: A quality control engineer measures the voltages of a
random sample of 30 batteries from his production line. From the
sample mean voltage x̄ = 7.52 and the sample standard deviation of
voltages s = .90, he wishes to estimate tolerance limits that will, on the
average, contain 95% of the population of batteries. Assuming the
distribution of battery voltages to be normal, what shall these tolerance
limits be?

The tolerance limits will be of the form x̄ ± k_{.95} s. Entering Table 3
with n = 30, he finds k_{.95} = 2.079. Hence the tolerance limits are:


    7.52 ± 2.079(.90) = 7.52 ± 1.87 = 5.65 to 9.39.


Notice that k_{.95} = 2.079 is larger than the value 1.96 that would be
used if μ and σ were known.
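
The factor in the battery example can be recomputed without a table. The sketch below is only an illustration, not part of the original paper: it assumes the factor is the two-sided Student t point times √((n+1)/n), and it obtains the t point by a simple numerical integration and bisection (the helper names `t_cdf`, `t_quantile` and `average_tolerance_factor` are hypothetical).

```python
import math

def t_cdf(t, df, steps=2000):
    """Student t CDF: Simpson's rule on [0, |t|], then symmetry about 0."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_quantile(prob, df):
    """Quantile by bisection; this sketch assumes 0.5 <= prob < 1."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def average_tolerance_factor(alpha, n):
    """k such that xbar +/- k*s covers a proportion alpha on the average."""
    t = t_quantile((1 + alpha) / 2, n - 1)  # two-sided point, n - 1 d.f.
    return t * math.sqrt((n + 1) / n)

# Battery example: n = 30, xbar = 7.52, s = .90, alpha = .95
k = average_tolerance_factor(0.95, 30)
xbar, s = 7.52, 0.90
lower, upper = xbar - k * s, xbar + k * s
print(round(k, 3), round(lower, 2), round(upper, 2))
```

The same function with α = .50 and n = 2 gives the first entry of the k_{.50} column, since the t point with one degree of freedom is then exactly 1.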

One sided tolerance limits. Suppose now the problem is to find the
value of k′ such that, on the average, the proportion of the normal
population less than x̄ + k′s is a specified value α. By the same procedure
as in the proof for the two sided case (Appendix) it may be shown
that

    k'_{\alpha} = k_{2\alpha - 1} .                               (5)

A similar result holds if the proportion of the normal population
greater than x̄ − k′s is to be a specified value α, on the average.

Example: A pilot run of 40 electron tubes is made. For each tube the
plate current in milliamperes, x, under normal operating voltages, is
measured; for the sample x̄ = 12.25, s = .68. From past experience with
similar tubes, it is known that x is normally distributed. What
procedure shall be followed to determine the value of L such that 99% of
the population of tubes will, on the average, have a plate current less
than L?

We may write
    L = x̄ + k′_{.99} s.
Then according to (5)
    k′_{.99} = k_{2(.99)−1} = k_{.98}.
Table 3 furnishes k_{.98} = 2.455. Hence
    L = 12.25 + 2.455(.68) = 13.92.


4.1.3 Population mean unknown, standard deviation known. In this
case an interval of the form

    \bar{x} \pm k_6 \sigma                                        (6)

must be used. Using the same method as in the proof given in the
Appendix, the following result may be derived:
If the expected value E(p) of the proportion p of the normal universe
contained in (6) is to be α, then

    k_6 = \sqrt{\frac{n+1}{n}}\; t_{1-\alpha,\,\infty} .

For purposes of comparison, k₆ is given in Table 2 for α = .50 and for
n = 2 (1) 30, 40, 60, 120, ∞.

4.1.4 Population mean known, standard deviation unknown. In this
case the interval

    \mu \pm k_7 s                                                 (7)

must be used. Again using the same method as in the proof of the
Appendix, the appropriate value for k₇ to include, on the average, a
proportion α of the normal universe is given by

    k_7 = t_{1-\alpha,\,n-1} .

For purposes of comparison, values of k₇ are given in Table 2, for α = .50
and n = 2 (1) 30, 40, 60, 120, ∞.

4.2 Confidence statement about tolerance interval. A number of papers
have been written on the problem of confidence statements for tolerance
intervals [2], [3], [6], [7], [8], [9]. The problem may be illustrated
as follows:

4.2.1 Population mean and standard deviation unknown. Suppose
the battery engineer mentioned in Section 4.1.2 asked the following
question: What value of k shall I take so that I can be 95% confident
that x̄ ± ks will include at least 80% of my population of batteries?

Bowker [3, pp. 102-107] gives extensive tables of k such that "in
TABLE 2
FACTORS FOR TOLERANCE INTERVALS

Columns k₄ to k₇: intervals that will include, on the average, 50% of
the population. Columns k₈ and k₉: intervals that will include at least
50% of the population 50% of the time.

                        k₄       k₅       k₆       k₇       k₈       k₉
μ known (K) or
  unknown (U)           K        U        U        K        K        U
σ known (K) or
  unknown (U)           K        U        K        U        U        K
Form of interval      μ±k₄σ    x̄±k₅s    x̄±k₆σ    μ±k₇s    μ±k₈s    x̄±k₉σ
For discussion
  see Section         4.1.1    4.1.2    4.1.3    4.1.4    4.2.2    4.2.3

    n       k₄       k₅       k₆       k₇       k₈       k₉
    2     .674    1.225     .826    1.000    1.000
    3     .674     .942     .779     .816     .810
    4     .674     .855     .754     .765     .759
    5     .674     .812     .739     .741     .736
    6     .674     .785     .729     .727     .723
    7     .674     .768     .721     .718     .714
    8     .674     .754     .715     .711     .708
    9     .674     .744     .711     .706     .704
   10     .674     .737     .707     .703     .701
   11     .674     .731     .704     .700     .698
   12     .674     .725     .702     .697     .696
   13     .674     .721     .700     .695     .694
   14     .674     .718     .698     .694     .692
   15     .674     .715     .697     .692     .691
   16     .674     .712     .695     .691     .690
   17     .674     .710     .694     .690     .689
   18     .674     .708     .693     .689     .688
   19     .674     .706     .692     .688     .687
   20     .674     .705     .691     .688     .687
   21     .674     .703     .690     .687     .686
   22     .674     .701     .690     .686     .685
   23     .674     .701     .689     .686     .685
   24     .674     .699     .688     .685     .684
   25     .674     .699     .688     .685     .684
   26     .674     .697     .687     .684     .684
   27     .674     .697     .687     .684     .683
   28     .674     .696     .686     .684     .683     .680
   29     .674     .695     .686     .683     .683     .680
   30     .674     .694     .686     .683     .682     .680
   40     .674     .689     .683     .681     .680     .678
   60     .674     .685     .680     .679     .678     .677
  120     .674     .680     .677     .677     .676     .676
    ∞     .674     .674     .674     .674     .674     .674

a large series of samples from normal universes a certain proportion γ
of the intervals x̄ ± ks will include P or more of the universe; γ is called
the "confidence coefficient" since it is a measure of the confidence with
which we may assert that a given tolerance range includes at least P
of the universe." In these tables γ = .75, .90, .95, .99, .999.

4.2.2 Population mean known, standard deviation unknown. Consider
the case where μ is known and σ unknown. Then an interval of the form

    \mu \pm k_8 s                                                 (8)

can be set up to include at least a proportion P of the population with
confidence γ as follows:
Let us take specific values of P = .80 and γ = .95 for illustrative
purposes. We note first that p is monotonic increasing with s (and
with s²). Hence when s² takes on a value exceeded 95% of the time
(call it s²_{.95}), p will take on a value exceeded 95% of the time. But

    s^2_{.95} = \frac{\chi^2_{.95,\,n-1}}{n-1}\, \sigma^2 ,

where χ²_{.95, n−1} is the value of χ² with n − 1 degrees of freedom that is
exceeded 95% of the time. Then the appropriate value of k₈ is

    k_8 = t_{.20,\,\infty} \Big/ \sqrt{\chi^2_{.95,\,n-1}/(n-1)} .

Values of k₈ for P = γ = .50 for n = 2 (1) 30, 40, 60, 120, ∞ are given
in Table 2, for purposes of comparison.
For general P, γ,

    k_8 = t_{1-P,\,\infty} \Big/ \sqrt{\chi^2_{\gamma,\,n-1}/(n-1)} .
4.2.3 Population mean unknown, standard deviation known. In this
case, an interval of the type

    \bar{x} \pm k_9 \sigma                                        (9)

must be used. Let us solve for k₉ when P = .80, γ = .95 to illustrate the
reasoning.

We first note that 95% of the x̄'s lie in the interval μ ± (t_{.05,∞}/√n)σ;
that is, 95% of the x̄ ± k₉σ intervals have their centers inside the
interval μ ± (t_{.05,∞}/√n)σ. Now we find k₉ such that the normal curve
area between μ + (t_{.05,∞}/√n)σ − k₉σ and μ + (t_{.05,∞}/√n)σ + k₉σ is .80. Then
95% of the x̄ ± k₉σ intervals will contain p ≥ .80 (namely those intervals
for which x̄ lies in μ ± (t_{.05,∞}/√n)σ).

It follows that the interval (9) will contain a proportion .80 or more
of the population, .95 of the time.

Values of k₉ for P = γ = .50 are given in Table 2 for n = 2 (1) 30, 40,
60, 120, ∞. For general P, γ, k₉ is the value such that the normal curve
area between μ + (t_{1−γ,∞}/√n)σ − k₉σ and μ + (t_{1−γ,∞}/√n)σ + k₉σ is P.

5. Relationship between confidence intervals and tolerance intervals. 
There is a very interesting relationship between confidence intervals 
and tolerance intervals that may be illustrated by the following ex- 
ample: 

Suppose, as in Section 3.3, we wanted to find a prediction (or “confidence”)
interval for the mean of a second sample. But now let n₂ = 1.
In other words, we will now be finding a confidence interval for a single
future observation. According to the result in Section 3.3 our answer is

    \bar{x}_1 \pm t_{1-\alpha,\,n_1-1} \sqrt{\frac{1}{n_1} + 1}\; s_1
      = \bar{x}_1 \pm t_{1-\alpha,\,n_1-1} \sqrt{\frac{n_1+1}{n_1}}\; s_1    (10)

where α is the confidence coefficient.

What does this mean? One way of looking at it is that if repeatedly
a sample of size n₁ is first drawn and then a second sample of one item
is drawn, then a proportion α of the time the single item will lie in the
interval (10). But a little thought shows that this is exactly equivalent
to stating that in repeated samples of size n₁, the average proportion,
p, of the population contained in (10) is α. In other words, confidence
limits with confidence coefficient α for a second sample of size one
are identical with tolerance limits that will include a proportion α on
the average. This is confirmed by the fact that (10) is the same as (4)
(except for the subscript 1).
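
The equivalence just described can be checked roughly by simulation. The sketch below is an illustration only and is not from the paper: it assumes a standard normal population, takes n = 30 with the tabled factor 2.079, and uses hypothetical function names, trial counts and seeds. It estimates both the average proportion covered by x̄ ± ks and the frequency with which one further observation falls inside the interval.

```python
import random
from statistics import NormalDist, mean, stdev

def average_coverage(n, k, trials=4000, seed=1):
    """Mean proportion of a N(0, 1) population inside xbar +/- k*s."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar, s = mean(sample), stdev(sample)
        total += std_normal.cdf(xbar + k * s) - std_normal.cdf(xbar - k * s)
    return total / trials

def future_observation_hit_rate(n, k, trials=4000, seed=2):
    """Fraction of trials in which one further draw lies in xbar +/- k*s."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar, s = mean(sample), stdev(sample)
        future = rng.gauss(0.0, 1.0)
        hits += abs(future - xbar) <= k * s
    return hits / trials

coverage = average_coverage(30, 2.079)
hit_rate = future_observation_hit_rate(30, 2.079)
print(coverage, hit_rate)
```

Both quantities should come out close to .95, apart from sampling error in the simulation itself.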

The above is an illustration of a theorem by Paulson [5]: 

“If confidence limits U₁(x₁, ..., xₙ) and U₂(x₁, ..., xₙ) on a probability
level = α, are determined for g, a function of a future sample of
TABLE 3
FACTORS, k_α, FOR TOLERANCE INTERVALS SUCH THAT
x̄ ± k_α s WILL INCLUDE A PROPORTION α OF
THE POPULATION, ON THE AVERAGE

Sample size, n    k.50     k.75     k.90     k.95     k.98     k.99     k.999
      2          1.225    2.957    7.733   15.562   38.973   77.964  779.699
      3           .942    1.852    3.372    4.969    8.042   11.460   36.486
      4           .855    1.591    2.631    3.558    5.077    6.530   14.469
      5           .812    1.473    2.335    3.041    4.105    5.043    9.432
      6           .785    1.405    2.176    2.777    3.635    4.355    7.409
      7           .768    1.361    2.077    2.616    3.360    3.963    6.370
      8           .754    1.330    2.010    2.508    3.180    3.711    5.733
      9           .744    1.307    1.961    2.431    3.053    3.536    5.314
     10           .737    1.290    1.922    2.372    2.959    3.409    5.014
     11           .731    1.276    1.893    2.327    2.887    3.310    4.791
     12           .725    1.264    1.869    2.291    2.829    3.233    4.618
     13           .721    1.255    1.849    2.261    2.782    3.170    4.481
     14           .718    1.246    1.833    2.236    2.743    3.118    4.369
     15           .715    1.239    1.819    2.215    2.710    3.075    4.276
     16           .712    1.234    1.807    2.197    2.682    3.038    4.198
     17           .710    1.228    1.797    2.181    2.658    3.006    4.131
     18           .708    1.224    1.788    2.168    2.637    2.977    4.074
     19           .706    1.220    1.779    2.156    2.618    2.953    4.024
     20           .705    1.216    1.772    2.145    2.602    2.932    3.979
     21           .703    1.213    1.766    2.135    2.587    2.912    3.941
     22           .701    1.210    1.760    2.127    2.575    2.895    3.905
     23           .701    1.207    1.754    2.119    2.562    2.880    3.874
     24           .699    1.205    1.749    2.112    2.552    2.865    3.845
     25           .699    1.202    1.745    2.105    2.541    2.852    3.819
     26           .697    1.200    1.741    2.099    2.532    2.840    3.796
     27           .697    1.198    1.737    2.094    2.524    2.830    3.775
     28           .696    1.197    1.733    2.088    2.517    2.820    3.755
     29           .695    1.195    1.730    2.083    2.509    2.810    3.737
     30           .694    1.193    1.727    2.079    2.503    2.802    3.719
     40           .689    1.182    1.706    2.047    2.455    2.741    3.602
     60           .685    1.171    1.686    2.017    2.411    2.684    3.492
    120           .680    1.161    1.665    1.988    2.368    2.628    3.388
      ∞           .674    1.150    1.645    1.960    2.326    2.576    3.291

See Section 4.1.2 for a discussion of this case.
k observations, with distribution Ψ(g), and p = ∫_{U₁}^{U₂} Ψ(g) dg, then
E(p) = α.”

In the illustration of this section, g corresponds to the value of the
single future observation and k = 1. Similarly we can check the results
of Sections 4.1.3 and 4.1.4 by the use of Paulson's theorem.


APPENDIX 


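Mathematical proof of Wilks’ result. The details of the derivation
(independently obtained by I. R. Savage of the Statistical Engineering
Laboratory, National Bureau of Standards) of the result of Section
4.1.2 are given, since the method is a suggestive one.

The problem is to determine k so that the average p contained in
x̄ᵢ ± ksᵢ (i = 1, 2, ...) will be α. By an appropriate linear transformation,
the problem may be reduced to that of finding

    E(p) = C_1 \int_0^{\infty} \int_{-\infty}^{\infty} \int_{\bar{x}-ks}^{\bar{x}+ks}
           e^{-y^2/2}\, e^{-n\bar{x}^2/2}\, s^{n-2} e^{-(n-1)s^2/2}\; dy\, d\bar{x}\, ds ,

where C₁ is a constant free of k. In the following, Cᵢ = constant free of k.
The conditions for differentiating under the integral hold. Hence
we have

    \frac{\partial E}{\partial k}
      = C_1 \int_0^{\infty} \int_{-\infty}^{\infty}
        \left[ s\, e^{-(\bar{x}+ks)^2/2} + s\, e^{-(\bar{x}-ks)^2/2} \right]
        e^{-n\bar{x}^2/2}\, s^{n-2} e^{-(n-1)s^2/2}\; d\bar{x}\, ds

      = C_1 \int_0^{\infty} \int_{-\infty}^{\infty}
        e^{-\frac{1}{2}(n+1)\left(\bar{x} + \frac{ks}{n+1}\right)^2}
        e^{-\frac{1}{2}\frac{n}{n+1} k^2 s^2}\, s^{n-1} e^{-(n-1)s^2/2}\; d\bar{x}\, ds

      + C_1 \int_0^{\infty} \int_{-\infty}^{\infty}
        e^{-\frac{1}{2}(n+1)\left(\bar{x} - \frac{ks}{n+1}\right)^2}
        e^{-\frac{1}{2}\frac{n}{n+1} k^2 s^2}\, s^{n-1} e^{-(n-1)s^2/2}\; d\bar{x}\, ds .

Putting u = √(n+1) (x̄ ± ks/(n+1)) in the two integrals respectively,

    \frac{\partial E}{\partial k}
      = C_2 \int_0^{\infty} \int_{-\infty}^{\infty}
        e^{-u^2/2}\, s^{n-1}
        e^{-\frac{1}{2}\left[(n-1) + \frac{n}{n+1}k^2\right] s^2}\; du\, ds
      = C_3 \int_0^{\infty} s^{n-1}
        e^{-\frac{1}{2}\left[(n-1) + \frac{n}{n+1}k^2\right] s^2}\; ds .

Let

    v = s \sqrt{(n-1) + \frac{n}{n+1} k^2} .

Then

    \frac{\partial E}{\partial k}
      = C_3 \left[ (n-1) + \frac{n}{n+1} k^2 \right]^{-n/2}
        \int_0^{\infty} v^{n-1} e^{-v^2/2}\, dv
      = \frac{C_4}{\left[ 1 + \frac{n}{n+1}\, \frac{k^2}{n-1} \right]^{n/2}} .

Hence

    E(p) = C_4 \int_0^{k}
           \frac{dk}{\left[ 1 + \frac{n}{n+1}\, \frac{k^2}{n-1} \right]^{n/2}} .

Now let

    t = \sqrt{\frac{n}{n+1}}\; k ,

so that

    E(p) = C_5 \int_0^{k\sqrt{n/(n+1)}}
           \frac{dt}{\left[ 1 + t^2/(n-1) \right]^{n/2}} .

But the integrand is the well known Student-t density function. Now
when k = ∞, E(p) = 1. Hence C₅ must be identical with the constant
of the Student-t distribution (for the integral taken over both tails).
Therefore the result of Section 4.1.2 follows:

    k = t_{1-\alpha,\,n-1} \sqrt{\frac{n+1}{n}} .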
A STATISTICALLY PRECISE AND RELATIVELY
SIMPLE METHOD OF ESTIMATING THE BIO-
ASSAY WITH QUANTAL RESPONSE, BASED
ON THE LOGISTIC FUNCTION


JOSEPH BERKSON
Mayo Clinic, Rochester, Minnesota.


THE logistic function is given by

    P = \frac{1}{1 + e^{-(\alpha + \beta x)}} .                   (1)

Its straight line transform is

    \text{logit } P = \ln \frac{P}{Q} = \alpha + \beta x          (2)

so that, if the logit P is plotted against x, the points will fall on a
straight line, with α as the intercept and β the slope.

The function (1) has had many statistical applications [37] and has
been advanced for use in bio-assay by, among others, Emmens [17],
Wilson and Worcester [46], and Berkson [7]. In bio-assay x measures
the “dose”¹ and P the “response.” If the response is measured, not in
terms of a continuous scale such as weight or length, but in terms of
the observed proportion p affected out of n “exposed,” the response is
said to be “quantal,” and in this statistical model it is assumed that
the observation p at x can be considered a random variable binomially
distributed around the “true” P at x, with variance σ²_p = PQ/n.

For this situation, the present writer has advanced a method [7] of
calculating a and b, which are estimates of α and β respectively,
presently called the “minimum logit χ² estimate,” based on a minimization
of the following quantity, called the “logit χ²”:

    \chi^2(\text{logit}) = \sum n p q\, (l - \hat{l})^2            (3)

where n is the number exposed at x, p = 1 − q is the observed proportion
affected, l = ln(p/q) is the logit of p, l̂ = ln(p̂/q̂) = a + bx is the logit of p̂,
where p̂ is the estimate of P at x, given by (1) with estimates a, b
replacing the parameters α, β.
The minimum logit χ² estimate has the following properties:² 1.
The logit χ² (3) is distributed asymptotically as χ². 2. The estimate is


¹ x is frequently the logarithm of the dose rather than the dose measured directly.
² See Appendix Note 2 for reference to other estimates.
asymptotically efficient, and therefore as the number of animals at
each dose n → ∞, the value of the variance of the asymptotic distribution
is minimum and given by 1/I, where I = E(∂ ln φ/∂θ)²,
φ being the probability of the total sample and θ the parameter α or β
which is to be estimated. 3. It is sufficient and therefore, in the concept
of Fisher, extracts the total amount of information available in
the sample which is relevant to the estimated parameters [25]. 4. For
finite samples (a) it has smaller sampling variance than 1/I, and (b)
has smaller sampling error (mean square error) and smaller variance
about the mean than the maximum likelihood estimate. These properties
hold for all values of the parameters [8, 4, 40].

The normal equations for obtaining the minimum logit χ² estimate
of α and β are

    \sum n p q\, (l - \hat{l}) = 0 ,                               (4)

    \sum n p q\, x\, (l - \hat{l}) = 0 .                           (5)

The evaluation of (4) (5) leads to a procedure that amounts simply to
obtaining a least squares solution of the straight line

    \hat{l} = a + b x

using npq as weight of the observation l. The estimates are given by

    b = \frac{\sum npq\,(x - \bar{x})(l - \bar{l})}{\sum npq\,(x - \bar{x})^2}
      = \frac{\sum npq\,x l - \dfrac{\sum npq\,x \sum npq\,l}{\sum npq}}
             {\sum npq\,x^2 - \dfrac{(\sum npq\,x)^2}{\sum npq}}    (6)

    a = \bar{l} - b\bar{x}
      = \frac{\sum npq\,l - b \sum npq\,x}{\sum npq}               (7)

where mean l, l̄ = Σnpql/Σnpq; mean x, x̄ = Σnpqx/Σnpq.

It will be noted that the equations (6) (7) contain the quantities pq
and pql, which are functions of p, the observed fraction affected. These
have been tabled for p as argument giving w = pq and wl = pql (Table
3). The estimates then are

    b = \frac{\sum nw\,(x - \bar{x})(l - \bar{l})}{\sum nw\,(x - \bar{x})^2}
      = \frac{\sum nwl\,x - \dfrac{\sum nwl \sum nwx}{\sum nw}}
             {\sum nwx^2 - \dfrac{(\sum nwx)^2}{\sum nw}} ,         (8)

    a = \bar{l} - b\bar{x}
      = \frac{\sum nwl - b \sum nwx}{\sum nw} .                    (9)

The E.D. 50 = γ, the dosage value x which produces 50 per cent effect,
is the value of x for which P = 0.5, and is obtained by letting
P = 0.5 in (1), yielding

    \gamma = -\frac{\alpha}{\beta} .


The estimate of γ, represented as x₅₀, is given by

    x_{50} = -\frac{a}{b} .                                       (10)

The equations (8) (9) are explicit solutions of (4) (5) and, being
entirely in terms of the observations, provide directly definitive estimates
of α and β.
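
Because the estimates are explicit, they can be computed in one pass over the data. The sketch below is an illustration under stated assumptions, not the author's own computing scheme: it assumes the estimates are the weighted least squares solution with weights nw = npq described above, the function name `min_logit_chi2_fit` is hypothetical, and doses with p = 0 or p = 1 (which have no finite logit) are simply rejected. The check data are noise-free and hypothetical.

```python
import math

def min_logit_chi2_fit(x, n, r):
    """Weighted least squares of l = ln(p/q) on x with weights n*p*q.

    x: dose metameters, n: numbers exposed, r: numbers affected.
    Returns (a, b, x50), where x50 = -a/b.
    """
    sw = swx = swl = swxx = swxl = 0.0
    for xi, ni, ri in zip(x, n, r):
        p = ri / ni
        if p <= 0.0 or p >= 1.0:
            # No finite logit exists; handling such doses is outside this sketch.
            raise ValueError("observed proportion must lie strictly between 0 and 1")
        q = 1.0 - p
        l = math.log(p / q)   # observed logit
        w = ni * p * q        # weight n*w, with w = p*q
        sw += w
        swx += w * xi
        swl += w * l
        swxx += w * xi * xi
        swxl += w * xi * l
    b = (swxl - swx * swl / sw) / (swxx - swx ** 2 / sw)
    a = (swl - b * swx) / sw
    return a, b, -a / b

# Hypothetical noise-free check: proportions lying exactly on l = -1.5 + 1.0 x
doses = [0.0, 1.0, 2.0, 3.0]
exposed = [100, 100, 100, 100]
affected = [100 / (1 + math.exp(-(-1.5 + xi))) for xi in doses]
a, b, x50 = min_logit_chi2_fit(doses, exposed, affected)
print(a, b, x50)
```

With observed proportions that lie exactly on a logit line, the fit should return that line, which gives a quick sanity check of an implementation.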

This is in contrast with the method of probits using maximum likelihood,
as advanced by Bliss [10, 11] and by Finney [22] which, beginning
with a provisional solution obtained graphically by eye, requires
an iterative procedure, in which the maximum likelihood estimate is
approached asymptotically as the number of iterative cycles is increased,
but which in general does not actually attain the exact definitive
maximum likelihood estimate.³

Tables giving the logit, l, for argument p (Table 1),⁴ the antilogit, p,
for argument l (Table 2), and w = pq and wl = pql for argument p (Table 3)
are provided. Also illustrated are two graph papers (Fig. 1),
which have been published⁵ for the simple use of the straight-line transform,
one with an arithmetically spaced scale for x and the other providing
also logarithmic scales of different total extent for different
ranges of dosage.
CALCULATION OF STANDARD ERRORS
We may write the estimated logit linear equation as

    \hat{l} = a + bx = a' + b(x - \bar{x})

where a′ = l̄ = a + bx̄.


³ For further comments on difficulties involved with use of iterative procedures,
see Appendix Note 1.
⁴ See Appendix Note 3, on the definition of logit.
⁵ Cost defrayed by the Mayo Foundation for Medical Education and Research. Sold by the Codex
Book Company, Norwood, Massachusetts.
TABLE 2
ANTILOGITS

Entries give value of p for specified positive value of logit l;
if l is negative, p is 1 minus the tabled value.


TABLE 3
LOGISTIC WEIGHTS

Upper figure is w = pq; lower figure is wl = pql.
For p less than .50 on left, wl is negative. For p greater
than .50 on right, wl is positive.

             Thousandths, for p on left
   p         0         1         2         p
  .00      .0000     .0010     .0020      .99
             -       .0069     .0124
  .01      .0099     .0109     .0119      .98
           .0455     .0489     .0523
  .02      .0196     .0206     .0215      .97
           .0763     .0790     .0816
  .03      .0291     .0300     .0310      .96
           .1012     .1034     .1056
  .04      .0384     .0393     .0402      .95
           .1220     .1239     .1258
  .05      .0475     .0484     .0493      .94
           .1399     .1415     .1431
  .06      .0564     .0573     .0582      .93
           .1552     .1566     .1580
  .07      .0651     .0660                .92
If x is measured as the logarithm of the “dose” D, then x₅₀ = log D₅₀,
where x₅₀ is the estimate of γ, the value of x corresponding to a 50 per
cent response, and D₅₀ is the estimate of the actual dose producing this
response. Formulas for variances of the estimates of the parameters
may be written as follows:⁶

    s^2_{a'} = \frac{1}{\sum nw} , \qquad
    s^2_{b} = \frac{1}{\sum nw\,(x - \bar{x})^2} ,

    s^2_{a} = s^2_{a'} + \bar{x}^2 s^2_{b} ,                       (11)

    s^2_{x_{50}} = \frac{1}{b^2}
        \left[ s^2_{a'} + (x_{50} - \bar{x})^2 s^2_{b} \right] .

These formulas provide closely accurate estimates of the variances, 
under ideal conditions in which (a) the “true” P’s are given exactly 
by equation (1), (b) the samples are “random” at each dose, the doses 
themselves being fixed quantities, and (c) the number of animals used 
at each dose is large. 
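
The variance formulas can be evaluated directly once b and x₅₀ are in hand. The sketch below is a hedged illustration, not the paper's worksheet: it assumes the variance formulas take the form s²_{a′} = 1/Σnw, s²_b = 1/Σnw(x − x̄)², s²_a = s²_{a′} + x̄²s²_b and s²_{x₅₀} = [s²_{a′} + (x₅₀ − x̄)²s²_b]/b², with x̄ the weighted mean of x. The function name and the small data set are hypothetical, and b and x₅₀ are passed in as given numbers rather than fitted.

```python
import math

def logistic_standard_errors(x, n, r, b, x50):
    """Formulary ('internal') standard errors for a', b, a and x50."""
    w = [ni * (ri / ni) * (1 - ri / ni) for ni, ri in zip(n, r)]  # n*p*q
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw              # weighted mean of x
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    var_a_prime = 1.0 / sw
    var_b = 1.0 / sxx
    var_a = var_a_prime + xbar ** 2 * var_b
    var_x50 = (var_a_prime + (x50 - xbar) ** 2 * var_b) / b ** 2
    return {
        "a_prime": math.sqrt(var_a_prime),
        "b": math.sqrt(var_b),
        "a": math.sqrt(var_a),
        "x50": math.sqrt(var_x50),
    }

# Hypothetical two-dose illustration; b and x50 are supplied, not estimated here.
se = logistic_standard_errors([0.0, 1.0], [100, 100], [50, 50], b=2.0, x50=0.5)
print(se)
```

As the surrounding text stresses, such values describe only the idealized sampling model and are best read as a lower baseline for the actual error of an assay.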

The conditions (a) and (b) can be maintained with satisfactory simil- 
itude in sampling experiments, set up with the use of random numbers 
or the like, mechanical shuffling of cards which have been appropri- 
ately prepared, or where equivalent experimental conditions have been 
deliberately and carefully arranged [4]. However, in the experiments 
with bio-assay as actually performed in the laboratory, this is impossible.
In the first place, the “assumption” that the “true” P's follow
exactly some specified function such as (1), or the equivalent statistical
“assumption” that the sampled p’s approach these P's with probability
approaching 1, as n → ∞, is, of course, only an idealization, employed
to establish a working model, and can be expected, at best, to be only
approximately true in fact. But more importantly, even if this ap-
proximation is close enough to be considered precise, each of the many 
different manipulations involved in accomplishing a bio-assay, such as 
the preparation of specified dosage concentrations, the administration 
of the drug (for instance by spraying of insecticides or injection of toxic 
drugs into individual animals), as well as the unstable behavior of the 
animals from instant to instant, results in variations that have the 
same effect as “errors.” These errors, even when no animal experi- 
ments are involved, are frequently large [2]. All of them influence to 
a small or great degree the variation of the bio-assay, and their net


⁶ They may be derived as the asymptotic variances with estimates then substituted for the
parameters [13, 45]. In the case of s_{x₅₀}, the formula is
Ф(1а 2) =1/2 су, where 2 represents an average.
effect is to increase the error of the assay beyond the values given by
the formulas (11).

As respects the condition (c), in the practice of bio-assay the numbers
of animals usually are not large, and even aside from the errors of
dosage referred to, the formulas (11), since they contain estimates of
the parameters α, β instead of the parameters, are only estimates of
the asymptotic variances, and themselves have a statistical error that
is quite large.

Only by direct observation of bio-assays of the same drug, with а 
program of repeated experiments so designed as to include all the 
sources of variation involved in the bio-assay as ordinarily made, can 
the actual error be evaluated. Very few experiments meeting these re- 
quirements have been performed, but such investigations as have been 
made indicate that the real error of the bio-assay is generally consider- 
ably larger than the values given by the formulary estimates [15, 30]. 
Paradoxically, the discrepancy between actual error and the formulary 
values generally increases with the sample size, because while the 
formulas indicate decreasing error with increasing n, many of the ex- 
perimental errors are not reduced with increase in the number of 
animals in the individual experiment, but only with increase of the 
number of independent experiments [5, 16]. 

These remarks are made in order to serve as a warning that the 
formulary calculated errors, sometimes referred to as “internal es- 
timates,” must be used with great caution, and that no conclusions 
regarding comparative assays that depend on the assumption that 
these estimates are measures of the actual error are reliable unless this 
assumption is checked independently by experiment. The formulary 
estimates are nonetheless frequently useful, even essential, serving as
a minimum baseline from which calculations can be made, and their
evaluation is therefore illustrated here for the examples presented.

Following are four examples illustrating the use of the minimum
logit χ² method, in different situations such as are met in practice.
The discussion of various points which accompanies the examples is to
be read as part of the definition of the minimum logit χ² method as here
advanced. Following the examples are three appendix notes, which also
are to be read as a definitive part of the present essay.


EXAMPLE 1. GENERAL CASE; STRAIGHT LINE TRANSFORM; NUMBER
OF SIGNIFICANT FIGURES; CALCULATION OF χ²


The use of the straight-line equation to represent the logistic function
in the minimum logit χ² method is a particular example of a
method of fitting that is very old. If a function Y = f(x, θ₁, θ₂, ···) is
to be fitted, where the θ’s represent parameters, this method consists
of finding some function Y′ of Y which is a linear function of x and
fitting Y′ against x as a straight line. Practical texts on curve fitting
formerly used this principle almost exclusively.

It hardly can be doubted that one of the chief reasons for the wide 
use of this method is the opportunity that it affords for the efficient 
utilization of graphic analysis as an adjunct to algebraic treatment. 
In many situations, there is no more effective simple method of testing 
the fitness of a proposed function to a set of observed data than to 
plot the data and the function and to look at the fit. And if the function 
can be put in such a form that, if the function really fits, the plot will 
be linear, then there is an inestimable practical gain. One is able to 
discern systematic deviations from the function which reflect what may 
be an important departure of the observations from hypothesis, when 
the application of formal statistical tests may fail to do so. At the 
same time, the plotted graph serves as a “control chart” for the identi- 
fication of points that fall off the trend so far that they are to be 
suspected of being out of “statistical control” and to be discarded. Bliss 
and also Finney, though they make good use of available statistical 
tests, also discard points that appear clearly to be far off the trend— 
in my opinion, a sensible and statistically sound practice. Even so
advanced a mathematical treatise as that of Cramér notes that waves of
the observations may be discerned in a graphical representation, which
are not reflected in a significant χ² deviation, and Gaddum [27] makes
the point specifically in connection with bio-assay. This particular
capacity to judge fitness by eye, when the function is linear, is generally
taken for granted, but when reflected upon, appears to be a most
remarkable psychologic phenomenon, to which, so far as I know, no
study has been applied. There is no equal capacity to judge by eye
from a graph whether a function is, say, exponential or parabolic, even
as there is no instrument comparable to a straight edge that one can
lay down among the plotted points, to judge whether they fall along
an exponential curve rather than a parabolic curve, and having laid
it down, to read on the instrument an estimate of the exponential
parameters. It is no wonder that in older books of statistics, curve
fitting is, almost by definition, the use of a straight-line transform and
the accomplishment of a straight-line fit. A great practical help in the
use of graphic treatment is the availability of appropriately
printed graph sheets, and for the more ready use of graphic methods
with the logistic function I have had printed two graph sheets for the
convenient plotting of the linear logit relation. These are illustrated in
Figure 1 in connection with Example 1, to follow; the legend attached
to that figure explains the method of using these sheets.⁷

Table 4 shows the details of calculation for a general case. Estimates
may be given according to the following rule: Carry the estimated
standard error of the estimate to two significant figures and the estimate
itself to the number of decimal places given in the error so written.
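
The reporting rule is mechanical enough to write down as a small routine. The following sketch is one possible reading of the rule, with a hypothetical function name and hypothetical input values; it returns strings so that trailing zeros are kept.

```python
import math

def report(estimate, std_error):
    """Round the standard error to two significant figures and the
    estimate to the same number of decimal places."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    # Decimal places needed to show two significant figures of the error.
    decimals = 1 - math.floor(math.log10(std_error))
    if decimals > 0:
        return f"{estimate:.{decimals}f}", f"{std_error:.{decimals}f}"
    # Errors of 10 or more: round both to the corresponding power of ten.
    scale = 10 ** (-decimals)
    return (str(int(round(estimate / scale) * scale)),
            str(int(round(std_error / scale) * scale)))

# Hypothetical values: a slope of 7.0603 with estimated standard error 0.879
print(report(7.0603, 0.879))
```

For these inputs the error is written 0.88 and the estimate 7.06, that is, to the two decimal places shown in the error.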

The same principle may be used to govern the number of significant 
figures which should be carried at various points in the computations. 
A sufficient number should be retained throughout, to insure that the 
estimates as finally set down be definitively determined. A sine-qua-non 
of any statistical procedure responsibly advanced for general use is 
that it be so defined that two workers with the same data will arrive 
at the same result, to the degree of precision retained in its final 
promulgation.

The number of significant figures retained in the computations of the
following examples has been determined by trial, to insure that the
modest standards of precision set down be fulfilled. However, no rigid
rules applying to all cases can be given, and it is possible that some
problems might be encountered in which the number of significant
figures used in the examples would not be sufficient, while in others a
smaller number would have sufficed.


⁷ Logit scaled graph papers have been printed privately from time to time, among which should
be mentioned the early one of Wilson [43].


FIG. 1. Plot of observations and fitted logit line for data of example 1, on
logit graph sheets. These sheets can be purchased from the Codex Book Company,
Norwood, Massachusetts.

In the upper figure (a) is illustrated the use of a sheet on which the abscissa
is scaled arithmetically. Since the function is related to the logarithm of the dose,
the logarithms are scaled on this co-ordinate. The practice is followed of first
expressing the dosages as fractions of the smallest amount used (D′) so that the
first dose in this scale is always unity and its logarithm zero. The percentage
response is scaled on the left ordinate. On the right ordinate is scaled the logit of
the response; this is useful in various ways, as for instance when plotting the
fitted line l̂ = 7.06x − 1.91, the values of l̂ can be located directly without
transposition to percentages.

In the lower figure (b) the same data are plotted on a sheet which has four
logarithmic scales, to accommodate the sheet to various ranges of dosage. The
initial dosage being unity, the smallest over-all scale is used which will
accommodate all the dosages in the experiment; in the present case, since the highest
dosage is 3.92, it is the second scale. The dosages D′ are located directly on this
scale, and the data plotted accordingly. If only a single range logarithmic scale is
provided, to cover the widest range found in practice, as in most published papers,
the advantage of the logarithmic scale is frequently offset by having the entire
graph compressed to a small fraction of the sheet, thus negating the main purpose
of securing a good graphic representation.

Rules for retaining significant figures differ in different laboratories. 
In routine calculations I maintain for intermediate calculations 6 
decimal places, except where more are necessary to have a minimum of
6 significant figures. The rule is followed “blindly” without adjustment 
for consistency at the various steps of the interim calculations. Adjust- 
ment is made in the final statement of results, in accordance with the 
principle stated above of retaining two significant figures in the es- 
timated standard error, and decimal consistency with this so far as the 
estimates are concerned. I have found this procedure much more satis- 
factory than attempting to adjust the number of figures specifically 
for the various particular steps of the computation. In the examples, 
the rules just described have been followed, except that a minimum of 
4 decimal or significant figures, rather than 6, has been maintained. 


Calculation of χ²
The Pearson χ² is given by

    \chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} .   (12)

In the present situation
n = number of animals at x
r = number of animals at x affected
s = n − r, number of animals at x not affected
p = r/n, proportion of animals at x affected
q = 1 − p = s/n, proportion of animals at x not affected
p̂ = "expected" value of p at x, obtained by inserting the estimates
a, b, for α, β in the logistic function (1)
q̂ = 1 − p̂, “expected” value of q at x
p̂n = "expected" number of animals at x affected
q̂n = “expected” number of animals at x not affected.

Hence

    \chi^2 = \sum \left[ \frac{(r - \hat{p}n)^2}{\hat{p}n}
                       + \frac{(s - \hat{q}n)^2}{\hat{q}n} \right] .               (13)

The χ² can be evaluated directly from (13). However, it is easily shown
that (13) is identically equal to

    \chi^2 = \sum \frac{n\,(p - \hat{p})^2}{\hat{p}\hat{q}} .                        (14)
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Since (14) is easier to compute than (13), a direct calculation of X can 
appropriately employ (14). Applied to the present example, Table 5 
shows this calculation. 

What has been calculated is the X^ directly as defined. However, the 
logit X" is a very close approximation of the Pearson Х° and therefore 
X* can be obtained by calculating 


X'(logit) = У) npq(l — I)? = У nw(l — В), (15) 


Since for (15) we do not need to compute р from 1 or evaluate п/р, 
the replacement quantity npg =nw having already been calculated in 
computing the estimates, the evaluation of X^ from the approximate 
formula (15) is considerably easier than the computation from (14). 
The calculation of X* by this method is shown in Table 4, where it is 
incorporated in the calculation of the estimates themselves. As is seen, 
the value for X* computed in this way (1.39) is very close to that ob- 
tained by direct computation (1.40). 
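
The three forms (13), (14) and (15) can be compared numerically with a 
short Python sketch. It assumes the group sizes and fitted proportions 
printed in Table 5; the counts r are taken from the rotenone rows of 
Table 7, whose n and p agree with Table 5. The function names are 
hypothetical, and because p is carried here at full precision rather than 
to three decimals, the totals agree with the printed 1.40 and 1.39 only to 
about two decimals. 

```python
import math

n = [50, 48, 46, 49, 50]                      # animals at each dose
r = [6, 16, 24, 42, 44]                       # animals affected
p_hat = [0.12953, 0.32145, 0.53907, 0.80542, 0.90737]  # fitted p, Table 5


def logit(p):
    return math.log(p / (1.0 - p))


def pearson_13(n, r, p_hat):
    """Pearson chi-square as defined: both cells at every dose, eq. (13)."""
    total = 0.0
    for ni, ri, ph in zip(n, r, p_hat):
        exp_aff = ph * ni
        exp_not = (1.0 - ph) * ni
        total += (ri - exp_aff) ** 2 / exp_aff
        total += ((ni - ri) - exp_not) ** 2 / exp_not
    return total


def pearson_14(n, r, p_hat):
    """Algebraically identical short form, eq. (14)."""
    return sum(ni * (ri / ni - ph) ** 2 / (ph * (1.0 - ph))
               for ni, ri, ph in zip(n, r, p_hat))


def logit_chi2_15(n, r, p_hat):
    """Logit chi-square, eq. (15): weights npq times squared logit error."""
    total = 0.0
    for ni, ri, ph in zip(n, r, p_hat):
        p = ri / ni
        total += ni * p * (1.0 - p) * (logit(p) - logit(ph)) ** 2
    return total


x13 = pearson_13(n, r, p_hat)
x14 = pearson_14(n, r, p_hat)
x15 = logit_chi2_15(n, r, p_hat)
print(round(x13, 3), round(x14, 3), round(x15, 3))
```

Forms (13) and (14) agree to rounding error, while (15) differs slightly 
because it is an approximation and not an identity. 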


TABLE 5. EXAMPLE 1 (cont.) 
DIRECT CALCULATION OF $\chi^2$ 

  x   |  n |   p   | $\hat{l}$ | $\hat{p}$* | $(p-\hat{p})^2$ | $\hat{p}\hat{q}$ | $n/\hat{p}\hat{q}$ | $n(p-\hat{p})^2/\hat{p}\hat{q}$ 
0.000 | 50 | 0.120 | -1.9051 | 0.12953 | 0.00009 | 0.11275 | 443.5 | 0.0399 
0.164 | 48 | 0.333 | -0.7471 | 0.32145 | 0.00013 | 0.21812 | 220.1 | 0.0286 
0.292 | 46 | 0.522 |  0.1566 | 0.53907 | 0.00029 | 0.24847 | 185.1 | 0.0537 
0.471 | 49 | 0.857 |  1.4205 | 0.80542 | 0.00266 | 0.15672 | 312.7 | 0.8318 
0.593 | 50 | 0.880 |  2.2819 | 0.90737 | 0.00075 | 0.08405 | 594.9 | 0.4462 

Total                                                                   1.4002 

$\chi^2 = 1.4$ 

* Obtained from antilogit Table 2 by linear interpolation. The value of 
$\hat{p}$ to the accuracy given in the table can be obtained for the logit 
with two additional decimal places, by linear interpolation, over the 
entire range of the table. 


EXAMPLE 2. EQUAL NUMBERS AT ALL DOSES; DOSAGE CONCENTRATIONS IN 
CONSTANT PROPORTION; ZERO OR 100 PER CENT OBSERVATIONS 


If n, the number of animals used, is the same at each concentration, 
it can be seen from the normal equations (4) (5) that n can be 
eliminated, and we may consider, for the purpose of estimation, that the 
value is unity at each dose, thus simplifying the calculations. 

ESTIMATING THE BIO-ASSAY WITH QUANTAL RESPONSE 581 

In many bio-assay experiments, the dosages are made up by successive 
dilutions in the same proportion, so that the ratio of the concentration 
of each dose to the next smaller dose is constant. If the logistic 
function (1) is considered to hold in relation to the logarithm of the 
dose, rather than to the dose itself, then in the logarithmic measure the 
doses will increase arithmetically. In such situations it is possible to 
“code” the dosage simply, in successive equally spaced integers, a 
possibility that not only facilitates computation but increases its 
accuracy, since the logarithm itself will ordinarily be used only to a 
small number of decimal places and therefore will not always be precisely 
correct. Suppose the constant of proportionality is $k$, and the lowest 
dose is symbolized $D_1$; then the successive concentrations will be 
$D_1, kD_1, k^2D_1, \ldots, k^sD_1$, and their logarithms will be 
$\log D_1$, $(\log D_1 + \log k)$, $(\log D_1 + 2\log k)$, $\ldots$, 
$(\log D_1 + s\log k)$. If we code $x$ as 

$$x = \frac{\log D - \log D_1}{\log k} = \frac{\log (D/D_1)}{\log k},$$

the successive values of $x$ will be $0, 1, 2, \ldots, s$, which is the 
coding to be used. Even if not all, but most of the dosages, progress in 
some constant proportion, it is still desirable to use such a coding, 
because of the greater precision of the resulting calculations. The 
example (Table 6) will illustrate the use of this facilitation in 
computation. 
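
The coding of doses in constant proportion can be sketched as follows. The 
doses are those of Table 6 (lowest dose 0.0625, ratio 2); the function name 
`code_doses` is hypothetical, and the default of inferring k from the two 
smallest doses is an assumption of this sketch, not part of the method as 
stated. 

```python
import math


def code_doses(doses, k=None):
    """Return coded dosages x = log(D / D1) / log(k)."""
    ordered = sorted(doses)
    d1 = ordered[0]
    if k is None:
        # Assumed here: the ratio of the two smallest doses is the
        # constant of proportionality.
        k = ordered[1] / ordered[0]
    return [math.log(d / d1) / math.log(k) for d in doses]


doses = [0.0625, 0.125, 0.25, 0.5, 1.0, 2.0]
coded = code_doses(doses)
print([round(x, 6) for x in coded])
```

Because the coded values are exact integers, no rounding error from a 
logarithm table enters the sums of products. 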

The example is taken from a study by Irwin and Cheeseman [30] 
which employed seven doses, but I have not used the observation of 
100 per cent at the seventh dose, which followed an observation of 100 
per cent at the sixth dose. The omission of the final observation of 100 
per cent mortality may be deemed arbitrary, but it is based upon the 
following considerations. 

The model used, according to which the “true” P's follow the logistic 
function, can be only approximately correct, as has been remarked 
previously, but for many situations it may be considered [...] enough 
to serve as the working basis for obtaining the [...] estimates. This is 
sound, however, only as a general appraisal; in some respects the model 
is more unrealistic than in others. In one respect it [...]. According 
to the logistic model (1), it is necessary to have an infinitely large 
dose for P = 1 (100 per cent) and a dose of zero (log dose = $-\infty$) 
for P = 0; this is unrealistic, of course. We are dealing with 
all-or-none response in animals. It is characteristic in pharmacologic 
experience that one must increase the dosage to some [...] dose before 
any animal will show the “all” response, and that above a certain dosage 
all animals will show this response. Actually, a dose less than that 
corresponding to the E.D. 50 by a [...] amount may be quite innocuous, 
while the dosage above which all animals respond may be larger than the 
E.D. 50 by only a small amount, and is never an inordinately large 
amount. So far is this true in many cases, that the potency is often 
defined as the “minimum lethal dose,” this being the dose below which 
none, and above which all, animals die. 

This sort of standardization is by no means always as foolish as on 
occasions it has been made out to be, as a personal experience taught 
me. Working with certain drugs, I found that the slope of the dosage 
mortality curve was so high that the zero and 100 per cent points (or 
their close approximations) covered a range of dosage so small that 
experiments could not be controlled effectively with dosage varied within 
that range. In a practical sense, the range between the dose below which 
all true P's were zero and the dose above which all true P's were 100 per 
cent was so narrow that it could be considered infinitesimal. To consider 
all observations of 100 per cent as random samples from true P's which 
are never 100 per cent, as statistical theory of quantal response does, 
is quite in contradiction of the known biologic facts. Where theory and 
fact contrast so violently, it is foolhardy to press the theory very 
hard. It is the considered opinion of the present writer that 
observations with zero or 100 per cent should not be used at all, and 
that when they occur, another experiment should be performed with 
different dosages at values where observations of zero or 100 per cent 
are very unlikely; but this is probably an extreme position that will not 
be generally acceptable. I may point out, however, that in the widely 
advocated Kärber [18, 30] method of estimate of L.D. 50, if at two 
successive doses an observation of zero per cent mortality is made, only 
the larger of the two doses is used, and similarly only the smallest of 
several consecutive doses which show 100 per cent mortality is 
considered, in making the calculation. There seems to be something 
unreasonable in never using certain observations in one good method of 
estimation and always using them in another. My suggestion, then, is to 
use at most one such observation at each extremity. 

TABLE 6. EXAMPLE 2 

DOSES IN CONSTANT PROPORTION. EQUAL n (50) FOR ALL DOSES 
TOXIN BACTERIUM TYPHI MURIUM 
(FROM IRWIN AND CHEESEMAN [30])* 

Dose, D | x |   w    |   wl    |   wx 
0.0625  | 0 | 0.1056 | -0.2104 | 0 
0.125   | 1 | 0.1204 | -0.2186 | 0.1204 
0.25    | 2 | 0.2244 | +0.1488 | 0.4488 
0.5     | 3 | 0.1716 | +0.2172 | 0.5148 
1.0     | 4 | 0.0900 | +0.1977 | 0.3600 
2.0     | 5 | 0.0099 | +0.0455 | 0.0495 

* Total experiment omitting observation at D = 4.00. 

$\sum w = 0.7219 \qquad \sum wx = 1.4935 \qquad \sum wl = 0.1802$ 
$\bar{x} = \sum wx / \sum w = 2.0688 \qquad \bar{l} = \sum wl / \sum w = 0.2496$ 
$\sum wx^2 = 4.2499 \qquad \sum wxl = 1.7489$ 
$(\sum wx)^2 / \sum w = 3.0898 \qquad (\sum wx)(\sum wl) / \sum w = 0.3728$ 
$\sum w(x - \bar{x})^2 = 1.1601 \qquad \sum w(x - \bar{x})(l - \bar{l}) = 1.3761$ 
$b = \sum w(x - \bar{x})(l - \bar{l}) / \sum w(x - \bar{x})^2 = 1.1862 \qquad a = (\sum wl - b\sum wx) / \sum w = -2.2044$ 
$x_{50} = -a/b = 1.8584 \qquad \log D_{50} = x_{50}\log 2 + \log D_1 = \bar{1}.3553 \qquad D_{50} = 0.2266$ 
$s^2_{\bar{l}} = 1/(n\sum w) = 0.02770$ 
$s^2_b = 1/(n\sum w(x - \bar{x})^2) = 0.01724 \qquad s_b = 0.13$ 
$s^2_a = s^2_{\bar{l}} + \bar{x}^2 s^2_b = 0.1015 \qquad s_a = 0.32$ 
$s^2_{x_{50}} = \frac{1}{b^2}\left[s^2_{\bar{l}} + (x_{50} - \bar{x})^2 s^2_b\right] = 0.02023 \qquad s_{x_{50}} = 0.14$ 
$a = -2.20 \pm 0.32 \qquad b = 1.19 \pm 0.13 \qquad x_{50} = 1.86 \pm 0.14$ 
$\hat{l} = 1.19x - 2.20$ 
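
The arithmetic of Table 6 can be retraced with a short Python sketch. It 
assumes the tabulated w and wl columns as input (with n = 50 at every dose, 
lowest dose 0.0625 and dose ratio 2); the function name and the dictionary 
keys are hypothetical. Results agree with the table to about four 
decimals, the table itself having been carried to four places. 

```python
import math

n = 50                                   # animals at every dose
x = [0, 1, 2, 3, 4, 5]                   # coded doses
w = [0.1056, 0.1204, 0.2244, 0.1716, 0.0900, 0.0099]
wl = [-0.2104, -0.2186, 0.1488, 0.2172, 0.1977, 0.0455]


def min_logit_chi2_from_table(x, w, wl, n):
    """Weighted least-squares line through the logits, with standard errors."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swl = sum(wl)
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x)) - swx ** 2 / sw
    sxl = sum(wli * xi for wli, xi in zip(wl, x)) - swx * swl / sw
    b = sxl / sxx
    a = (swl - b * swx) / sw
    x50 = -a / b
    xbar = swx / sw
    var_lbar = 1.0 / (n * sw)
    var_b = 1.0 / (n * sxx)
    return {
        "a": a, "b": b, "x50": x50,
        "se_a": math.sqrt(var_lbar + xbar ** 2 * var_b),
        "se_b": math.sqrt(var_b),
        "se_x50": math.sqrt((var_lbar + (x50 - xbar) ** 2 * var_b) / b ** 2),
    }


fit = min_logit_chi2_from_table(x, w, wl, n)
d50 = 0.0625 * 2 ** fit["x50"]           # decode: D50 = D1 * k ** x50
print({key: round(val, 4) for key, val in fit.items()}, round(d50, 4))
```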

I am, as just explained, disposed against using observations of zero 
or 100 per cent response in estimation in bio-assay. However, such 
observations may occur, and provision must be made for utilizing them. 
For an observation $p = 0$ or $p = 1$, the corresponding logit is 
infinite, the weight $pq$ is zero, while the weighted logit $pql$ 
approaches the limit zero as $p \to 0$ or $p \to 1$. If with these 
observations, we use for $pql$ the limiting value zero (a dubious 
procedure mathematically), the observations are effectively eliminated 
and hence actually are not used. One method for dealing with these 
observations is similar to that used in the iterative procedures of the 
probit method. A preliminary estimate may be made using all the 
observations except those of zero or 100 per cent. The value of 
$\hat{p}$ “predicted” by this fit at the values of $x$ corresponding to 
observations of zero or 100 per cent is used to replace the observation, 
and a minimum logit $\chi^2$ estimate is made using this working 
observation. The use of this method mars the elementary simplicity of 
the minimum logit $\chi^2$ estimate, when a zero or 100 per cent 
observation occurs, to the degree that it requires what amounts to one 
iteration. It should be noted, however, that only one “iteration” is 
required, and that the result is a definitive solution, which is quite a 
different situation from the one obtaining in the maximum likelihood 
procedures, where an undefined number of iterations is required and the 
solution is not definitive, if a graphic fit by eye was used for the 
original estimate. 

Another method of dealing with the zero or 100 per cent observation 
is to employ an old “empirical” rule, according to which one uses for 
zero a working observation $1/2n$, and for 100 per cent a working 
observation $(2n - 1)/2n$. In order to determine the relative merits of 
the two methods proposed, I performed some sampling experiments 
simulating a situation in bio-assay when zero and 100 per cent would be 
relatively frequent. In all cases I found that the error of estimate when 
the “empirical” rule was used was smaller than that obtained when the 
“iterative” scheme was used. Thus in the present situation the easier 
method is also the more precise, a situation which is similar to one 
found in comparing the minimum logit $\chi^2$ estimate with the maximum 
likelihood estimate. The definitive procedure now for dealing with zero 
and 100 per cent observations in the minimum logit $\chi^2$ method is 
therefore to use the rule of substituting $1/2n$ for zero and 
$(2n - 1)/2n$ for 100 per cent observations. 
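
The two ways of forming a working observation can be sketched in Python as 
below. The counts r are not printed in Table 6; they are inferred here 
from its weights w = pq and the signs of wl with n = 50, and should be read 
as an assumption of this sketch. All function names are hypothetical. 

```python
import math


def fit_line(x, p):
    """Minimum logit chi-square line l = a + b*x, weights w = pq (equal n)."""
    w = [pi * (1 - pi) for pi in p]
    l = [math.log(pi / (1 - pi)) for pi in p]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swl = sum(wi * li for wi, li in zip(w, l))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x)) - swx ** 2 / sw
    sxl = sum(wi * xi * li for wi, xi, li in zip(w, x, l)) - swx * swl / sw
    b = sxl / sxx
    return (swl - b * swx) / sw, b


def empirical_rule(r, n):
    """Working proportion: 1/2n for zero, (2n - 1)/2n for 100 per cent."""
    if r == 0:
        return 1 / (2 * n)
    if r == n:
        return (2 * n - 1) / (2 * n)
    return r / n


def one_iteration(x, r, n):
    """Preliminary fit without 0/100 per cent points; predicted p replaces them."""
    inner = [(xi, ri / n) for xi, ri in zip(x, r) if 0 < ri < n]
    a0, b0 = fit_line([xi for xi, _ in inner], [pi for _, pi in inner])
    p_work = [ri / n if 0 < ri < n else 1 / (1 + math.exp(-(a0 + b0 * xi)))
              for xi, ri in zip(x, r)]
    return fit_line(x, p_work), p_work


n = 50
x = [0, 1, 2, 3, 4, 5]
r = [6, 7, 33, 39, 45, 50]               # inferred counts, see lead-in

a_emp, b_emp = fit_line(x, [empirical_rule(ri, n) for ri in r])
(a_it, b_it), p_work = one_iteration(x, r, n)
print(round(a_emp, 4), round(b_emp, 4), round(p_work[-1], 4))
```

With the empirical rule the 100 per cent observation becomes 0.99, which 
reproduces the weight 0.0099 in the last row of Table 6. 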


EXAMPLE 3. COMPARATIVE ASSAYS; PARALLEL LINE PROCEDURE 


The following is an example in which we wish to estimate the toxicity 
or “potency” of a drug T, to be tested in terms of another drug S (the 
“standard”); that is, we wish to answer the question, “How much 
more (or less) potent is drug T than drug S?” The answer to this question 
will depend on definition, but one reasonable answer is based on the 
following rationale: Suppose that with each drug there is a definite 
relationship between dosage and percentage response, that is, that the 
greater the dosage the greater the response, but that these relationships 
are different for the two drugs so that for a given concentration the 
percentage response using T is different from the response using S. If 
now we consider some definite percentage response and find that it 
requires a concentration $C_S$ of the standard to produce say a 50 per 
cent effect, while it requires only one-third the concentration of drug 
T to produce this effect, we may reasonably say that T is three times 
as potent as S. Suppose, however, that it is found that for a 75 per cent 
response using S it requires not one-third, but only one-fourth the 
concentration of T, then we should have to say that for this response 
level, drug T is four times as potent as S. This will pose the dilemma 
that there is no unequivocal answer to the question of how much more 
potent is drug T than drug S. We shall have to say, “Drug T is three 
times as potent as S for an effect of 50 per cent and four times as potent 
as S for a 75 per cent effect.” 

Let us turn to the logistic function and put the problem in terms of 
the logit representation. If the logistic relation of response is to the 
log dose, the logit plotted against log dose will be a straight line. The 
horizontal distance between the two lines representing T and S 
respectively, at any value of the response, is the difference of the 
logarithms of the doses of T and S which produce that response, and this 
is the logarithm of the ratio of the doses themselves. If the logit lines 
are parallel, the horizontal distance will be the same at all values of 
response, and the measure of the relative potency as the ratio of the 
dosage concentrations which produce the same response will be 
unequivocal, so far as response level is concerned. If they are not 
parallel, the relative potency will depend on the response level to which 
it is referred, and it seems reasonable under these circumstances to 
refer to the 50 per cent response point as a standard convention. 
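
The dependence of relative potency on response level can be illustrated 
with a small Python sketch. Every line parameter below is hypothetical and 
chosen only so that the non-parallel case reproduces the “three times at 
50 per cent, four times at 75 per cent” situation described earlier; 
logits are natural logarithms and doses are on a common-logarithm scale. 

```python
import math


def log_dose_for_response(a, b, response):
    """log10 dose at which the logit line l = a + b*x reaches the response."""
    target = math.log(response / (1.0 - response))
    return (target - a) / b


def relative_potency(test, standard, response=0.5):
    """Ratio of standard dose to test dose producing the same response."""
    x_t = log_dose_for_response(*test, response)
    x_s = log_dose_for_response(*standard, response)
    return 10 ** (x_s - x_t)


# Hypothetical standard line: E.D. 50 at dose 1 (x = 0), slope 2.
standard = (0.0, 2.0)

# Parallel test line: same slope, shifted so that T is 3 times as potent.
parallel_test = (2.0 * math.log10(3), 2.0)

# Non-parallel test line through the two points implied by "3 times as
# potent at 50 per cent, 4 times as potent at 75 per cent".
x_t50 = log_dose_for_response(*standard, 0.50) - math.log10(3)
x_t75 = log_dose_for_response(*standard, 0.75) - math.log10(4)
b_t = math.log(3) / (x_t75 - x_t50)     # logit(0.75) - logit(0.50) = ln 3
nonparallel_test = (-b_t * x_t50, b_t)

for level in (0.50, 0.75):
    print(level,
          round(relative_potency(parallel_test, standard, level), 3),
          round(relative_potency(nonparallel_test, standard, level), 3))
```

For the parallel pair the ratio is the same at every level; for the 
non-parallel pair it changes with the level chosen, which is why a 
conventional reference level is needed. 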

It is a moot question whether the dose response curves of different 
drugs are in fact parallel in the sense referred to, that is, whether the 
ratio of dosage concentrations for equal response is the same for all 
response levels, in general, for most drugs, or ever. Too little 
investigation with large enough numbers of animals has been performed in 
respect of this question, to provide a reliable general answer. To 
attempt to ascertain whether the lines are parallel, by applying 
statistical significance tests ad hoc to individual samples at hand, is 
futile, for with the numbers in the samples generally employed, the power 
of existing tests is so small that even if the lines are really far from 
parallel, the probability of a significant test result is very small. 
Certainly there is a good deal of evidence that in many cases the 
response curves of similarly effective drugs with the same animals are 
not far different from parallel, within the 10 per cent to 90 per cent 
response levels. This is perhaps sufficient to justify the general 
practice of making relative assays by means of “parallel assays,” the 
procedures for which will be presently described and illustrated. 
However, another view is permissible. If we do not know or do not have 
good evidence that the dose response curves are parallel, it seems 
reasonable to make the assay by a statistical procedure which would be 
valid whether the lines are parallel or not. If the lines are in fact 
parallel, the ratio of the E.D. 50's of the two lines separately 
estimated is an estimate of the distance between the parallel lines only 
slightly less good (larger variance) than the estimate obtained by 
fitting the two lines on the assumption of parallelism. If the lines are 
in fact not parallel, then we should not fit parallel lines. Since the 
ratio of the concentrations at the 50 per cent response point is a good 
estimate for either assumption, and since it is very dubious that 
response curves are in fact ever actually parallel in a literal sense, it 
would seem reasonable that we use the ratio of the E.D. 50's and not 
parallel assays. Occam's razor, which requires the application of a 
minimum of assumptions, is still one of the soundest principles ever 
enunciated on which to base scientific procedure. To use the 50 per cent 
point as the conventional level for comparative assays, even in the face 
of the possibility that the lines are not parallel, is arbitrary, but no 
more so than using the E.D. 50 as the index of potency for an individual 
drug. Two drugs equivalent at the 50 per cent point may not be equivalent 
at other response levels; yet this has not prevented the development of a 
vast statistical literature on the calculation of the E.D. 50 in which it 
is implied that this point is an acceptable conventional level for the 
measure of potency. 

In the following example the relative potency of two drugs is 
estimated by fitting parallel lines, in order to illustrate the 
computational steps for such a procedure, but it should not be taken to 
imply that I am advocating this as obligatory in estimating relative 
potency. 

The basic principle of estimating two parallel logit lines is the same 
as that for a single line, that is, we minimize the logit $\chi^2$. 
However, since there are two logit lines, $l_s$ and $l_t$, to be fitted, 
and since the assumption of parallelism implies that the $\beta$ 
parameters are the same for both lines, there will be three parameters to 
estimate, $\alpha_s$, $\alpha_t$, and $\beta$, the estimates being 
represented respectively as $a_s$, $a_t$, and $b$. We shall minimize the 
total logit $\chi^2$, 

$$\chi^2(\text{logit}) = \sum_s npq(l - \hat{l}_s)^2 + \sum_t npq(l - \hat{l}_t)^2. \qquad (16)$$

The normal equations are 

$$\sum_s npq(l - \hat{l}_s) = 0,$$
$$\sum_t npq(l - \hat{l}_t) = 0,$$
$$\sum_s npqx(l - \hat{l}_s) + \sum_t npqx(l - \hat{l}_t) = 0,$$

where 

$$\hat{l}_s = a_s + bx,$$
$$\hat{l}_t = a_t + bx,$$

and the other symbols are defined as previously. 

The distance between the fitted parallel lines, symbolized $M$, is given 
by $M = (a_t - a_s)/b$; the ratio $R$ of the potency of the test drug to 
standard is given by $\log R = M$. 

Table 7 gives the details of calculation. 


TABLE 7. EXAMPLE 3 

PARALLEL LINE COMPARATIVE ASSAY. TOXICITY OF ROTENONE RELATIVE TO DEGUELIN 
(From Finney [22], pp. 68, 69) 

ROTENONE (test, subscript $t$) 

Concentration, | Log D | Deaths | Total no. | Proportion died |   nw    |   nwl    |  nwx 
mg./liter, D   |   x   |   r    |     n     |    r/n = p      |         |          | 
 2.6           | 0.415 |    6   |    50     |     0.120       |  5.2800 | -10.5200 | 2.1912 
 3.8           | 0.580 |   16   |    48     |     0.333       | 10.6608 |  -7.4064 | 6.1833 
 5.1           | 0.708 |   24   |    46     |     0.522       | 11.4770 |   1.0120 | 8.1257 
 7.7           | 0.886 |   42   |    49     |     0.857       |  6.0074 |  10.7506 | 5.3226 
10.2           | 1.009 |   44   |    50     |     0.880       |  5.2800 |  10.5200 | 5.3275 

$\sum nw = 38.7052 \qquad \sum nwx = 27.1503 \qquad \sum nwl = 4.3562$ 
$\bar{x}_t = \sum nwx / \sum nw = 0.7015 \qquad \bar{l}_t = \sum nwl / \sum nw = 0.1125$ 
$\sum nwx^2 = 20.3399 \qquad \sum nwxl = 12.1947$ 
$(\sum nwx)^2 / \sum nw = 19.0450 \qquad (\sum nwl)(\sum nwx) / \sum nw = 3.0557$ 
$\sum nw(x - \bar{x})^2 = 1.2949 \qquad \sum nw(l - \bar{l})(x - \bar{x}) = 9.1390$ 

DEGUELIN (standard, subscript $s$) 

[data rows not legible in the source] 

$\sum nw = 26.3937 \qquad \sum nwx = 32.4341 \qquad \sum nwl = [...]$ 
$\bar{x}_s = \sum nwx / \sum nw = 1.2289 \qquad \bar{l}_s = [...]$ 
$\sum nwx^2 = 41.1136 \qquad \sum nwxl = 30.8740$ 
$(\sum nwx)^2 / \sum nw = 39.8569 \qquad (\sum nwl)(\sum nwx) / \sum nw = 23.4199$ 
$\sum nw(x - \bar{x})^2 = 1.2567 \qquad \sum nw(l - \bar{l})(x - \bar{x}) = 7.4541$ 

COMBINED 

         | $\sum nw(x - \bar{x})^2$ | $\sum nw(l - \bar{l})(x - \bar{x})$ 
Deguelin |         1.2567           |          7.4541 
Rotenone |         1.2949           |          9.1390 
Total    |         2.5516           |         16.5931 

$b = \sum\sum nw(l - \bar{l})(x - \bar{x}) / \sum\sum nw(x - \bar{x})^2 = 6.5030$ 
$a_t = (\sum nwl - b\sum nwx) / \sum nw = -4.4491 \qquad a_s = -7.2692$ 

TABLE 7 (cont.) 

$M = \log R = (a_t - a_s)/b = 0.4337 \qquad R = 2.715$ 
$s^2_{\bar{l}_t} = 1/38.7052 = 0.02584$ 
$s^2_{\bar{l}_s} = 1/26.3937 = 0.03789$ 
$s^2_b = 1/2.5516 = 0.3919 \qquad s_b = 0.63$ 
$s^2_M = \frac{1}{b^2}\left[s^2_{\bar{l}_t} + s^2_{\bar{l}_s} + s^2_b(\bar{x}_s - \bar{x}_t - M)^2\right]$ 
$\quad = \frac{1}{42.2890}\left[0.02584 + 0.03789 + 0.3919(1.2289 - 0.7015 - 0.4337)^2\right] = 0.001588$ 
$s_M = s_{\log R} = 0.040 \qquad s_R = R\,s_{\log R} / \log_{10} e = 0.25$ 
$R = 2.71 \pm 0.25 \qquad b = 6.50 \pm 0.63$ 


EXAMPLE 4. RELATIVE POTENCY “4-POINT PARALLEL ASSAY” 


This is a widely used scheme for estimating the unknown potency of 
a drug T to be tested relative to a standard S, by making only two 
observations with each of the drugs. The known standard is used in 
concentrations, say $D_1$ and $D_2 = kD_1$; the unknown is diluted in the 
same proportion as the standard concentration. If the lower and higher 
concentrations of each of the drugs are coded respectively $x = 0$ and 
$x = 1$, the ratio $R$ of the potency of the unknown to standard is given 
by $\log R = [(a_T - a_S)/b]\log k$, where $k$ is the ratio of the larger 
to the smaller dosage of the standard. The fact that the values of $x$ 
are either zero or unity simplifies the summations required, so that a 
special format for the calculations is worth while, which is illustrated 
in Table 8. 
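
The four-point computation can be sketched in Python using the counts of 
Table 8 (24 animals at each dosage, dose ratio k = 1.5). The function names 
are hypothetical, and proportions are carried at full precision rather 
than to the three or four decimals of the table, so the results agree with 
Table 8 only to about three decimals. 

```python
import math

n = 24                                    # animals at each dosage
k = 1.5                                   # higher dose / lower dose
tested = [(0, 6), (1, 16)]                # (coded x, number responding)
standard = [(0, 7), (1, 21)]


def weighted_sums(rows, n):
    """Return sum w, sum wx, sum wl, sum wx^2, sum wxl for one drug."""
    sw = swx = swl = swxx = swxl = 0.0
    for x, r in rows:
        p = r / n
        w = p * (1 - p)
        l = math.log(p / (1 - p))
        sw += w
        swx += w * x
        swl += w * l
        swxx += w * x * x
        swxl += w * x * l
    return sw, swx, swl, swxx, swxl


def four_point_assay(tested, standard, n, k):
    t, s = weighted_sums(tested, n), weighted_sums(standard, n)
    # Pooled within-drug sums of squares and products give the common slope.
    sxx = sum(g[3] - g[1] ** 2 / g[0] for g in (t, s))
    sxl = sum(g[4] - g[1] * g[2] / g[0] for g in (t, s))
    b = sxl / sxx
    a_t = (t[2] - b * t[1]) / t[0]
    a_s = (s[2] - b * s[1]) / s[0]
    m = (a_t - a_s) / b
    log_r = m * math.log10(k)
    xbar_t, xbar_s = t[1] / t[0], s[1] / s[0]
    var_m = (1 / (n * t[0]) + 1 / (n * s[0])
             + (xbar_s - xbar_t - m) ** 2 / (n * sxx)) / b ** 2
    r_ratio = 10 ** log_r
    # s_R = R * s_logR / log10(e), with s_logR = s_M * log10(k).
    se_r = r_ratio * math.sqrt(var_m) * math.log10(k) * math.log(10)
    return {"b": b, "M": m, "R": r_ratio,
            "se_b": math.sqrt(1 / (n * sxx)), "se_R": se_r}


result = four_point_assay(tested, standard, n, k)
print({key: round(val, 4) for key, val in result.items()})
```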


APPENDIX NOTE 1. DIFFICULTIES IN PRACTICE WITH ITERATIVE METHODS 


The fact that an iterative procedure is needed for the probit solution 
with maximum likelihood results in a number of practical disadvantages. 
In the first place, a formidable amount of rather involved computation is 
required, and for this reason alone a number of protests have been issued 
against the method and alternative simpler methods have been proposed in 
order to alleviate the computational labor [14, 32, 33, 35, 36, 41]. But 
equally or more important is a consequent lack of precision of the 
estimates as achieved in practice, and this appears not to have been 
sufficiently considered. 

TABLE 8. EXAMPLE 4 

DECAMETHONIUM BROMIDE, KNOWN CONCENTRATION AND UNKNOWN TO BE TESTED 
EXPERIMENT OF DEWS AND BERKSON, 24 ANIMALS AT EACH DOSAGE 

Drug     | Dose, mg./ml., D |  r |   p   |   w*   |   wl* 
Tested   | D                |  6 | 0.250 | 0.1875 | -0.2060 
Tested   | 1.5 D            | 16 | 0.667 | 0.2221 |  0.1543 
Standard | 0.016            |  7 | 0.292 | 0.2067 | -0.1831 
Standard | 0.024            | 21 | 0.875 | 0.1094 |  0.2128 

* From Table 3. 

The four rows are numbered 1 to 4 in the order shown; $x = 0$ at the lower 
and $x = 1$ at the higher dose of each drug. 

$\sum w_T = w_1 + w_2 = 0.4096 \qquad \sum w_T l = w_1l_1 + w_2l_2 = -0.0517 \qquad \sum w_T x = w_2 = 0.2221$ 
$\sum w_S = w_3 + w_4 = 0.3161 \qquad \sum w_S l = w_3l_3 + w_4l_4 = 0.0297 \qquad \sum w_S x = w_4 = 0.1094$ 
$\sum wlx = w_2l_2 + w_4l_4 = 0.3671$ 
$\sum \frac{(\sum wx)(\sum wl)}{\sum w} = \frac{(w_1l_1 + w_2l_2)w_2}{w_1 + w_2} + \frac{(w_3l_3 + w_4l_4)w_4}{w_3 + w_4} = -0.01775$ 
$\sum\sum w(l - \bar{l})(x - \bar{x}) = \sum wlx - \sum \frac{(\sum wx)(\sum wl)}{\sum w} = 0.3848$ 
$\sum\sum w(x - \bar{x})^2 = \frac{w_1w_2}{w_1 + w_2} + \frac{w_3w_4}{w_3 + w_4} = 0.1732$ 
$b = \frac{\sum\sum w(l - \bar{l})(x - \bar{x})}{\sum\sum w(x - \bar{x})^2} = \frac{0.3848}{0.1732} = 2.2217$ 
$a_T = \frac{\sum w_T l - b\sum w_T x}{\sum w_T} = \frac{-0.0517 - 2.2217(0.2221)}{0.4096} = -1.3309$ 
$a_S = \frac{\sum w_S l - b\sum w_S x}{\sum w_S} = \frac{0.0297 - 2.2217(0.1094)}{0.3161} = -0.6750$ 
$M = (a_T - a_S)/b = -0.2952 \qquad \log R = M\log 1.5 = -0.05198 \qquad R = 0.8872$ 
$s^2_{\bar{l}_T} = \frac{1}{n\sum w_T} = \frac{1}{24(0.4096)} = 0.1017 \qquad s^2_{\bar{l}_S} = \frac{1}{n\sum w_S} = \frac{1}{24(0.3161)} = 0.1318$ 
$s^2_b = \frac{1}{n\sum\sum w(x - \bar{x})^2} = \frac{1}{24(0.1732)} = 0.2406 \qquad s_b = 0.49$ 
$\bar{x}_T = \sum w_T x / \sum w_T = 0.2221/0.4096 = 0.5422 \qquad \bar{x}_S = \sum w_S x / \sum w_S = 0.1094/0.3161 = 0.3461$ 
$s^2_M = \frac{1}{b^2}\left[s^2_{\bar{l}_T} + s^2_{\bar{l}_S} + s^2_b(\bar{x}_S - \bar{x}_T - M)^2\right] = \frac{1}{4.9360}\left[0.1017 + 0.1318 + 0.2406(0.009821)\right] = 0.04778$ 
$s_M = 0.2186 \qquad s_{\log R} = s_M\log 1.5 = 0.03850 \qquad s_R = R\,s_{\log R} / \log_{10} e = 0.07865$ 
$R = 0.887 \pm 0.079 \qquad b = 2.22 \pm 0.49$ 

In general practice, where only one cycle of iteration is used, the 
method does not yield a definitive solution, so that two workers using 
the same data will not necessarily obtain the same estimate. To insure 
definitiveness, strict rules must be laid down, to continue the 
iterations until constancy is attained in the estimates, to a specified 
number of significant figures [1, 29]. Probit solutions which have been 
published, even in authoritative texts advancing the method, and in 
important pharmacologic standardizations which employ the probit method, 
frequently cannot be checked arithmetically as maximum likelihood 
estimates, even to the number of decimal places to which they are carried 
in final publication. Different estimates applying to the same set of 
data have been published by different authors, and even by the same 
author [12, 19, 34], the differences reflecting the residual of lack of 
definition in the original graphic solution, not eliminated because an 
insufficient number of iterative cycles have been accomplished. 
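
A stopping rule of the kind just described (“continue the iterations until 
constancy is attained ... to a specified number of significant figures”) 
can be sketched in Python. The sketch is an illustration under stated 
assumptions: it iterates toward the maximum likelihood estimate of the 
logistic function, not the probit; it starts from the observed logits, not 
from a graphic fit by eye; and it uses the rotenone data of Table 7. All 
function names are hypothetical. 

```python
import math

x = [0.415, 0.580, 0.708, 0.886, 1.009]   # log10 dose
r = [6, 16, 24, 42, 44]                   # deaths
n = [50, 48, 46, 49, 50]                  # animals


def wls_line(x, weights, y):
    """Weighted least-squares intercept and slope of y on x."""
    sw = sum(weights)
    swx = sum(w * xi for w, xi in zip(weights, x))
    swy = sum(w * yi for w, yi in zip(weights, y))
    sxx = sum(w * xi * xi for w, xi in zip(weights, x)) - swx ** 2 / sw
    sxy = sum(w * xi * yi for w, xi, yi in zip(weights, x, y)) - swx * swy / sw
    b = sxy / sxx
    return (swy - b * swx) / sw, b


def start_from_observed_logits(x, r, n):
    """Reproducible first approximation: the minimum logit chi-square fit."""
    p = [ri / ni for ri, ni in zip(r, n)]
    weights = [ni * pi * (1 - pi) for ni, pi in zip(n, p)]
    return wls_line(x, weights, [math.log(pi / (1 - pi)) for pi in p])


def ml_cycle(a, b, x, r, n):
    """One iterative cycle: refit the line to working logits."""
    p_hat = [1 / (1 + math.exp(-(a + b * xi))) for xi in x]
    weights = [ni * ph * (1 - ph) for ni, ph in zip(n, p_hat)]
    # Working logit = fitted logit + (observed - expected count) / weight.
    working = [a + b * xi + (ri - ni * ph) / w
               for xi, ri, ni, ph, w in zip(x, r, n, p_hat, weights)]
    return wls_line(x, weights, working)


def round_sig(value, figures):
    return float(f"{value:.{figures - 1}e}")


def iterate_to_constancy(x, r, n, figures, max_cycles=100):
    """Iterate until a and b are unchanged to the given significant figures."""
    a, b = start_from_observed_logits(x, r, n)
    for cycle in range(1, max_cycles + 1):
        a_new, b_new = ml_cycle(a, b, x, r, n)
        constant = (round_sig(a_new, figures) == round_sig(a, figures)
                    and round_sig(b_new, figures) == round_sig(b, figures))
        a, b = a_new, b_new
        if constant:
            return a, b, cycle
    raise RuntimeError("constancy not attained within max_cycles")


a_ml, b_ml, cycles = iterate_to_constancy(x, r, n, figures=6)
print(round_sig(a_ml, 6), round_sig(b_ml, 6), cycles)
```

Reporting the starting rule, the number of significant figures and the 
number of cycles alongside the estimate is what makes the result 
checkable by a second worker. 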

If definitiveness of the estimates is to be achieved, even the decision 
as to what tables should be employed and the manner of their use is 
something of a problem. For the example illustrative of the probit 
method used by Fisher and Yates [26], there are given two sets of 
estimates. One, obtained with a single cycle of iteration, is b = .68906, 
L.D. 50 = 6.618; regarding the other, the authors say that “a much 
more precise fit gives 6.609 . . . for the 50 per cent point . . . the 
more exact value (for the slope) is .7126.” No information is proffered 
as to the number of cycles of iteration required for the “more exact” 
estimates, nor is a precise description given of how the tables of the 
volume in which the example is incorporated were used. It is a novel 
statistical doctrine that is reflected in this example, where it appears 
that there are two kinds of maximum likelihood estimates, one for 
ordinary everyday use and one for statistical Sundays when we use “more 
exact” estimates. Had the authors given b = .7, L.D. 50 = 6.6, as the 
first estimates, and b = .7126, L.D. 50 = 6.609, as the second, both sets 
could intelligibly be regarded as correct maximum likelihood estimates, 
the last more precisely determined than the first. But in respect to the 
estimates b = .68906, L.D. 50 = 6.618, and b = .7126, L.D. 50 = 6.609, how 
can both sets be correct maximum likelihood estimates? 

For the example in Finney's text [24], used to illustrate the 
computations of probit analysis, he gives b = 4.176 ± 0.466, and then 
says, “The slope has been altered from its provisional value of 4.01 by 
an amount equal to about one-third of its standard error, and if an 
accurate value of b were particularly required (sic)⁹ a further cycle of 
computations would be desirable; the next value obtained for b is, in 
fact, 4.196, the alteration being only 4% of the standard error.” The 
value given, 4.196, doubtless is obtainable, if the tables in Finney's 
text are used in some particular way, but I have not been able to 
reproduce it exactly, and when I used the W.P.A. tables of the normal 
deviate [39], employing methods which appear to me to be correct, 
iteration toward a maximum likelihood estimate yielded a value for b 
which is smaller, not larger, than 4.176. 

⁸ Garwood [29] [...] obtained a different value for b (.7128) and [...]. 

⁹ The value already calculated involved computations with the use of 7 
significant figures. 

The probit maximum likelihood method need not have been developed 
along lines which have led to the present chaotic situation, in which 
published estimates in official standardizations are not definitive and 
cannot be checked for the data on which they are based. In the article in 
which the mathematical development of the maximum likelihood estimate of 
the probit equation was first set forth, Irwin and Cheeseman [31] 
obtained the “first approximation,” not by a graphical fit accomplished 
by eye, but by using the observed probits, and weights $z^2/pq$ obtained 
from the observations. Had the procedure advanced by Irwin and Cheeseman 
been adopted as standard, then a notation with a published estimate, 
indicating the number of iterative cycles which had been accomplished, 
would have rendered the estimate reproducible, even if on some occasion 
it still might be criticized as not being the maximum likelihood 
estimate, because an insufficient number of iterative cycles had been 
employed. However, the procedure of Irwin and Cheeseman was abandoned in 
favor of the one employing a graphic fit by eye for the first 
approximation. 

Referring to the calculations of the probit estimates in ten repeated 
experiments, by three iterative methods, Method III being the maximum 
likelihood estimate, Irwin and Cheeseman say [31], “Starting with the 
probits corresponding to the observed mortalities . . . the convergence 
is not very rapid. About 6 successive approximations are needed to get 
accuracy to 2 significant figures. . . . Sample D needed 8, 11, and 9 
approximations respectively for Methods I, II, and III.” 

The procedure of “probit analysis” as widely advanced and practiced, 
consisting of a single cycle of iteration based on a provisional 
graphical estimate, actually is not a maximum likelihood estimate, but 
only a somewhat modified graphical solution. Since it is a step in the 
right direction toward the maximum likelihood estimate, perhaps it is 
entitled to the designation “likelihood estimate.” If one or two more 
iterations are performed, it could be called a “very likelihood 
estimate”; if as many as 9 iterations are accomplished, as in the example 
from Irwin and Cheeseman referred to above, an [...]. A really 
mathematical maximum likelihood estimate in the present circumstance is 
rarely attainable, but this estimate appears to be held so noble an 
objective that perhaps we should be contented only to aspire to achieve 
it. However, it must be remembered that it is solely to the actual 
maximum likelihood estimate that the optimum properties pertain, which 
Finney insistently claims for “probit analysis.” These optimum properties 
do not refer to a “likelihood estimate,” nor even to a “practically good 
enough” maximum likelihood estimate. 


APPENDIX NOTE 2. OTHER ESTIMATES OF THE LOGISTIC PARAMETERS; 
THE MAXIMUM LIKELIHOOD ESTIMATE 


In the investigations of the present author, it has been found that 
there is at least one other estimate of the parameters $\alpha$, $\beta$ 
of the logistic function which has smaller sampling error (mean square 
error) than the minimum logit $\chi^2$ estimate. This is the 
“Blackwellized” minimum logit $\chi^2$ estimate. The Blackwellized 
estimate is the expectation (weighted mean) of the estimates 
corresponding to the samples of a sufficiency group, which is the group 
of samples for which the sufficient statistics [in the present case 
($\sum np$, $\sum npx$)], have the same value. By an extension of 
Blackwell's theorem [9] to biased estimates, these estimates have a mean 
square error equal to or smaller than the original estimates, and their 
bias, if any, will be the same as that of the original estimates. In the 
present case, for each of the minimum $\chi^2$ estimates, the m.s.e. is 
less than, rather than equal to, the m.s.e. of these estimates before 
Blackwellization. This is a consequence of the fact that with the 
$\chi^2$ estimates, the estimates from the samples in the sufficiency 
group are not identical. The maximum likelihood estimate, on the 
contrary, while it is sufficient, necessarily is identical for all 
samples in the sufficiency group, and therefore is unchanged by 
Blackwellization. 
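
Blackwellization can be sketched for a small hypothetical design of three 
equally spaced doses with 10 animals each. The sketch enumerates the 
sufficiency group of an observed sample and averages the minimum logit 
$\chi^2$ estimates over it. Two points are assumptions of this 
illustration and not statements from the text: the weights are taken 
proportional to the product of binomial coefficients (the conditional 
probability of a sample given the sufficient statistics under the logistic 
model), and zero or 100 per cent counts inside the group are handled by the 
$1/2n$, $(2n - 1)/2n$ rule. Function names are hypothetical. 

```python
import math
from itertools import product


def working_p(r, n):
    """Observed proportion, with 1/2n and (2n - 1)/2n at zero and 100%."""
    if r == 0:
        return 1 / (2 * n)
    if r == n:
        return (2 * n - 1) / (2 * n)
    return r / n


def min_logit_chi2(x, r, n):
    """Minimum logit chi-square estimates (a, b) for equal group size n."""
    p = [working_p(ri, n) for ri in r]
    w = [pi * (1 - pi) for pi in p]
    l = [math.log(pi / (1 - pi)) for pi in p]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swl = sum(wi * li for wi, li in zip(w, l))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x)) - swx ** 2 / sw
    sxl = sum(wi * xi * li for wi, xi, li in zip(w, x, l)) - swx * swl / sw
    b = sxl / sxx
    return (swl - b * swx) / sw, b


def sufficiency_group(x, r, n):
    """All samples sharing (sum r, sum r*x) with the observed sample."""
    t0 = sum(r)
    t1 = sum(ri * xi for ri, xi in zip(r, x))
    return [s for s in product(range(n + 1), repeat=len(x))
            if sum(s) == t0 and sum(si * xi for si, xi in zip(s, x)) == t1]


def blackwellize(x, r, n):
    """Weighted mean of the estimates over the sufficiency group."""
    group = sufficiency_group(x, r, n)
    weights = [math.prod(math.comb(n, si) for si in s) for s in group]
    total = sum(weights)
    estimates = [min_logit_chi2(x, s, n) for s in group]
    a = sum(wt * e[0] for wt, e in zip(weights, estimates)) / total
    b = sum(wt * e[1] for wt, e in zip(weights, estimates)) / total
    return a, b


x = [0, 1, 2]                             # integer-coded doses
n = 10                                    # animals at each dose
print(sufficiency_group(x, (3, 5, 8), n))
print(min_logit_chi2(x, (3, 5, 8), n), min_logit_chi2(x, (2, 7, 7), n))
print(blackwellize(x, (3, 5, 8), n), blackwellize(x, (2, 7, 7), n))
```

The two samples (3, 5, 8) and (2, 7, 7) share the same sufficient 
statistics; their ordinary minimum logit $\chi^2$ estimates differ, while 
the Blackwellized estimate is the same for both. 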

The Blackwellized estimates have a characteristic in common with 
the maximum likelihood estimate, in that the estimate is the same for 
all samples having the same value for the sufficient statistics; this 
implies that for these estimates, the sufficient statistics $\sum np$ 
and $\sum npx$ of the sample uniquely determine the estimates. It is 
therefore possible, in principle, to prepare a table with two-way entry, 
corresponding to $\sum np$ and $\sum npx$, which gives the Blackwellized 
estimates. For the Blackwellized minimum logit $\chi^2$ estimate this 
would involve many calculations and a great deal of arithmetical 
labor.¹⁰ 

¹⁰ I have, however, calculated the Blackwellized estimate for several 
special cases, and confirmed by direct computation that it has a smaller 
m.s.e. than either the maximum likelihood estimate or the minimum 
$\chi^2$ estimates. 

The maximum likelihood estimate has a larger mean square error 
than the minimum logit $\chi^2$ estimate and therefore cannot be 
considered the best estimate available for the logistic parameters. If 
there is no practical reason to the contrary, such as difficulty of 
computation, it is certainly a sound principle that the best estimate be 
used. However, the maximum likelihood estimate is generally a very good 
estimate even if it is not always the best, and if there are 
circumstances in which it is easier to obtain than the minimum logit 
$\chi^2$ estimate, it should not be barred from good statistical 
practice. The maximum likelihood estimate of the logistic function does 
in fact have some characteristics which, with necessary preliminary work, 
make it obtainable even more easily than the minimum logit $\chi^2$ 
estimate. 

In general, it is possible to compute the maximum likelihood estimate
of the logistic parameters by iterative procedures using logits, analogous
with those used for obtaining the maximum likelihood estimate
of the parameters of the integrated normal curve, using probits
[6, 8, 26]. When obtained in this way, the estimates of the logistic
parameters have the same practically unsatisfactory character as the
probit or other iteratively obtained estimates; that is, they are not
accurately or even definitively obtained unless a sufficient number of
iterative cycles are accomplished to meet a specified degree of precision.
However, the maximum likelihood estimate of the logistic function
does not always require such iterative procedures. For instance, in
the case of three equally spaced doses x with the same number of animals
at each of these, Wilson and Worcester [45] have provided a
cubic equation in terms of two easily computed statistics of the
observations, the explicit solution of which yields the exact maximum
likelihood estimates. It is not arithmetically easy to obtain the solution
of this equation, but these authors have presented an approximation
which can be easily solved, and which gives the correct maximum likelihood
estimate to five or six significant figures. For more than three or perhaps
four doses, it is impossible to develop explicitly soluble formulas
such as Wilson provided for three doses. However, a scheme is described
below which could provide the solution directly, exact to any
desired number of places, for any specified arrangement of dosage and
number of animals.

The normal equations for the maximum likelihood estimates of the
parameters of the logistic function are

$$\sum n p = \sum n \hat{p}$$

$$\sum n p x = \sum n \hat{p} x$$


594 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1953 


where n is the number of animals, p the observed proportion affected,
and $\hat{p} = 1/(1 + e^{-(a + bx)})$, a and b being the maximum likelihood estimates.
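As a rough illustration of what solving these two equations involves, the following Python sketch applies Newton-Raphson iteration to them. It is an instance of the iterative route the text contrasts with the nomogram, not the author's own procedure; the function name `fit_logistic_ml`, the zero starting values, the fixed number of cycles and the sample data are all assumptions made for the illustration.

```python
import math


def fit_logistic_ml(doses, n, p, iterations=25):
    """Solve sum(n*p) = sum(n*p_hat) and sum(n*p*x) = sum(n*p_hat*x) for (a, b).

    doses, n and p are equal-length sequences: dose x, number of animals,
    and observed proportion affected at each dose.
    """
    a, b = 0.0, 0.0  # assumed starting values
    for _ in range(iterations):
        p_hat = [1.0 / (1.0 + math.exp(-(a + b * x))) for x in doses]
        # Discrepancies in the two normal equations.
        g0 = sum(ni * (pi - ph) for ni, pi, ph in zip(n, p, p_hat))
        g1 = sum(ni * (pi - ph) * x for ni, pi, ph, x in zip(n, p, p_hat, doses))
        # Weights n * p_hat * (1 - p_hat) and the 2 x 2 information matrix.
        w = [ni * ph * (1.0 - ph) for ni, ph in zip(n, p_hat)]
        h00 = sum(w)
        h01 = sum(wi * x for wi, x in zip(w, doses))
        h11 = sum(wi * x * x for wi, x in zip(w, doses))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 linear system explicitly.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical data: three equally spaced doses, ten animals at each.
a, b = fit_logistic_ml([-1.0, 0.0, 1.0], [10, 10, 10], [0.2, 0.5, 0.8])
```

For this symmetric hypothetical sample the equations can also be solved by hand (a = 0, b = log 4), which gives a convenient check on the iteration.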

It may be observed at once that the estimates are unique functions
of two easily computed statistics, Σnp and Σnpx (and if the n's are
equal, of Σp and Σpx), which are in fact the sufficient statistics for
these parameters. Hence, as has been mentioned for the
Blackwellized estimates, it is possible to prepare a table with the
two-way entry Σp and Σpx (considering the case with equal n) which
will provide the maximum likelihood estimates. Now, while it appears
at present that to make the necessary computations for the Blackwellized
estimates would involve a prohibitive amount of arithmetical
work, the maximum likelihood estimates can be provided by a nomogram
that is not difficult to construct. Consider a graph sheet on the
co-ordinates of which are scaled Σp and Σpx. For any pair of values a₁,
b₁, which are the maximum likelihood estimates of some α, β, and with
a specified arrangement of dosages, the values of Σp and Σpx corresponding
to these estimates are defined and given by Σp̂ and Σp̂x,
where p̂ corresponds to the logistic function with α = a₁ and β = b₁. The
values of Σp̂, Σp̂x correspond to a point on the graph. Now, if we
change b but not a, so that the estimates are a₁, b₂, another point will
be located corresponding to the same value of a as the first and a different
value of b. In this way a series of points (Σp, Σpx) can be located
which are the locus of the iso-a values of the maximum likelihood
estimates of α; all samples with maximum likelihood estimate a₁ have
their (Σp, Σpx) values located on this line. Hence if the Σp, Σpx of
any sample fall on this line, a₁ is the maximum likelihood estimate of α.
In the same manner the iso-a lines for other values of the estimates of
α and β are located, and on the resulting nomogram the estimates can
be read directly, corresponding to the value of Σp and Σpx of the
sample; the accuracy of the estimate will be limited only by the precision
with which the graph can be read, and therefore the scale on
which the nomogram is constructed. In a similar manner one can construct
a nomogram from which one can read the maximum likelihood
estimate of β or of the E.D. 50. An example of such a nomogram
giving the estimates of the E.D. 50 corresponding to the situation of
four equally-spaced doses and equal number of animals at each is given
in Figure 2. It is planned to construct additional nomograms corresponding
to three, five, six, and seven doses.
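The construction of one iso-a line can be sketched in a few lines of Python. This is only an illustration of the idea described above: the function names, the dose scaling (−1.5, −0.5, 0.5, 1.5) and the grid of b values are assumptions, and plotting is left out.

```python
import math


def logistic(a, b, x):
    """Logistic function 1 / (1 + exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


def sufficient_point(a, b, doses):
    """Point (sum of p_hat, sum of p_hat * x) belonging to the estimates (a, b)."""
    p_hat = [logistic(a, b, x) for x in doses]
    return sum(p_hat), sum(ph * x for ph, x in zip(p_hat, doses))


def iso_a_locus(a, b_values, doses):
    """Points of the nomogram line on which the estimate a is constant."""
    return [sufficient_point(a, b, doses) for b in b_values]


# Four equally spaced doses (assumed scaling) and an assumed grid of b values.
doses = [-1.5, -0.5, 0.5, 1.5]
locus = iso_a_locus(0.0, [0.5, 1.0, 1.5, 2.0], doses)
```

With doses placed symmetrically about zero, the line for a = 0 is the vertical line Σp = 2, since the fitted proportions at x and −x then sum to one; lines for other values of a curve away from it.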

Worcester and Wilson [44] have already provided a table giving the
maximum likelihood estimates calculated from Wilson's equation
referred to previously, for the case of three equally spaced doses with
equal n at each, in terms of the statistics A = 2(p₁ + p₂ + p₃) − 3 and
B = p₃ − p₁, where p₁, p₂ and p₃ are the observed proportions, respectively,
at doses scaled −1, 0, +1. It is seen that A = 2Σp − 3 and B = Σpx,
so that their table may be viewed as a special case of the general one for
which the nomograms are being prepared.

FIG. 2. The nomogram in the present figure [caption truncated]


APPENDIX NOTE 3. DEFINITION OF LOGIT 


The use of the straight-line transform, which I baptized as the “logit” 
in 1944 [7], is found in early articles which utilized the logistic function, 
an example being its use by Von Krogh (1916) in illustrating the fit 
of his logistic law of hemolysis [42]. This is not surprising, since the use 
of the straight-line transform method of fitting a curve is very old and 
elementary. The first published table of logits (though of course not by 
that name), so far as I know, is that of Yule [47] (1925). In 1944 I 
published a nomogram [7] from which logits and antilogits can be read, 
in which I inadvertently reversed the sensible convention of attaching 
a sign to the logit by which an increase of logit corresponds to an increase
of antilogit, a gaucherie I later corrected [6]. In March, 1950,
I issued an extensive table of logits, with weights and working values,
for the facilitation of fitting the logistic function by (a) maximum likelihood,
(b) minimum (Pearson) χ², or (c) “least squares,” published at
private expense and distributed widely for trial, but which has been 
withheld from general circulation, awaiting the results of investigation 
of the relative properties of these estimates. 

In 1947 Finney published a table [21] under the title “Transformation
of Percentages to Logits,” in which are given not the logits l as I
defined them, but instead a quantity 0.5l + 5. This alteration is of the
species effected in the normal deviate of Galton and Shepard [28],
when the number 5 was added to it in order to create “probits,” a
change the wisdom of which has been widely questioned (for example,
discussion by Fieller, Irwin in reference [21]). However, that modification
was comparatively simple, and the designation of the original
authors was not used for the substitute quantity. A later publication
by Finney [20] contains similar, more extensive tabulations, which
resemble my 1950 tables except in respect to the alteration referred
to. That alteration destroys the natural mathematical symmetry of
the logit, makes tabling twice as long as necessary, results in excessive
difficulty of computation because of the large numbers involved, and
introduces arbitrary constants into, and therefore confuses, all mathematical
developments involving the logistic function. Whatever are
the putative merits of Finney's quantities, they should not be referred
to as logits.
CRITICAL VALUES OF THE LOG-NORMAL
DISTRIBUTION


JACK MOSHMAN
Oak Ridge National Laboratory


INTRODUCTION 


A common statistical problem is that of testing a null hypothesis using
a statistic drawn from some unknown distribution. The general
configuration of the probability density function is known from
empirical evidence. It is suggested that for certain applications, the
logarithmic-normal distribution be used to approximate the unknown
distribution by equating the first three moments. It would then be
convenient to have a table of critical values of the log-normal distribution,
standardized for the first two moments and tabulated for various
values of the skewness.


PEARSON TYPE III DISTRIBUTION 


The Pearson Type III distribution is the only three-parameter distribution
whose integral has been extensively tabulated and is generally
available. There are two important differences between the log-normal
and Type III distributions in spite of their superficial similarity.

These differences may be exhibited as follows:

Criterion              Type III                        Log-Normal

Points of Inflection   Equidistant from mode           Distances from the mode
                                                       vary with the skewness,
                                                       but differ for non-zero
                                                       skewness.

High Contact           Not present at finite end for   Always present
                       large skewness; lower part
                       of curve may have negative
                       curvature.


APPLICATION


The use of the normal distribution in applications where the coefficient
of variation is large presents many difficulties. Observed values
more than twice the mean would then imply the existence of observations
with negative values. Frequently this is a logical absurdity. The
use of the logarithmic-normal distribution has been investigated as a
possible solution to this problem [2, 6, 8, 10, 11, 12, 13, 18, 22].

In a review of the literature Gaddum [5] found that the log-normal
distribution could be used to describe:

(a) The threshold of sensation;
(b) The size of silver particles in a photographic emulsion;
(c) The sensitivity to drugs;
(d) The survival time of insects treated with disinfectants;
(e) The average size of the different species in each of various phylogenetic groups;
(f) The number of plankton caught in different hauls of a net; and
(g) The amount of electricity used in medium-class American homes.


A non-exhaustive review of the literature revealed many other ap- 
plications. Yuan [22] found an excellent fit to weights of female stu- 
dents by the log-normal distribution. Kolmogoroff [15], Halmos [9], 
and Kottler [16] discussed the applicability of this distribution to the 
distribution of sizes of small particles. Epstein [4] derived the log-normal
distribution as the asymptotic distribution of particle sizes resulting
from breakage processes. Wicksell [20] applied the log-normal distribution
to graduate the frequencies of age upon marriage of bachelors
and spinsters. Sentence length of various authors was fitted by the log- 
normal distribution by Williams [21]. Application of the log-normal 
distribution to economic data was made by Gibrat [7] and to agricul- 
tural data by Cochran [1]. Recently, Krige [17] applied the log-normal 
curve to the distribution of gold values in the mines of the Witwaters- 
rand. Cureton [3] suggested that the log-normal distribution be used 
to approximate the distribution of means of samples from a finite popu- 
lation in one type of psychological test-item analysis. 


CRITICAL VALUES 


In many applications, it would be convenient to have a table of critical
values of the log-normal cumulative distribution corresponding to
specified values of the parameters of the distribution.

The log-normal probability density distribution may be written

$$f(x) = \frac{1}{c\,(x - a)\sqrt{2\pi}} \exp\left\{-\frac{1}{2c^2}\left[\log\frac{x - a}{b}\right]^2\right\}, \qquad (1)$$

and the cumulative distribution function is then

$$F(x) = \int_a^x f(y)\,dy, \qquad (2)$$

where we assume b > 0. If b < 0, then a represents the upper rather than
the lower limit of the distribution and all statements may be suitably
modified.
The parameters a, b, and c are related to the mean (μ), variance (σ²),
and skewness (α₃) as follows:

$$\mu = b\,\omega^{1/2} + a$$

$$\sigma^2 = b^2\,\omega(\omega - 1) \qquad (3)$$

$$\alpha_3 = \pm(\omega - 1)^{1/2}(\omega + 2),$$

where $\omega = e^{c^2}$. The sign of α₃ is chosen to agree with that of b.
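If in (1) we let

$$t = \frac{1}{c}\log\frac{x - a}{b}, \qquad (4)$$

then t is distributed normally with zero mean and unit variance. Solving
(4) for x,

$$x = b\,e^{ct} + a, \qquad (5)$$

and solving (3) for b and a,

$$b = \frac{\sigma}{\omega^{1/2}(\omega - 1)^{1/2}}, \qquad a = \mu - b\,\omega^{1/2} = \mu - \frac{\sigma}{(\omega - 1)^{1/2}}. \qquad (6)$$

Substituting from (6) into (5),

$$x = \frac{\sigma\,e^{ct}}{\omega^{1/2}(\omega - 1)^{1/2}} - \frac{\sigma}{(\omega - 1)^{1/2}} + \mu = \frac{\omega^{-1/2}e^{ct} - 1}{(\omega - 1)^{1/2}}\,\sigma + \mu. \qquad (7)$$

Then from (7)

$$\tau = \frac{x - \mu}{\sigma} = \frac{\omega^{-1/2}e^{ct} - 1}{(\omega - 1)^{1/2}}, \qquad (8)$$

where τ may be considered a standardized log-normal variate in terms
of the unit normal deviate t, ω and c, but ω and c are each expressible
in terms of α₃. If t_β is defined by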
$$\beta = \frac{1}{\sqrt{2\pi}}\int_{t_\beta}^{\infty} e^{-t^2/2}\,dt, \qquad (9)$$

then

$$\tau_\beta = \frac{\omega^{-1/2}e^{c\,t_\beta} - 1}{(\omega - 1)^{1/2}}. \qquad (10)$$

Thus to obtain, say, the 5 per cent critical value of the upper tail of
the log-normal distribution for α₃ = 2, one first solves the equation

$$\omega^3 + 3\omega^2 - (4 + \alpha_3^2) = 0, \qquad (11)$$

for ω, taking its only real root. Knowing ω,

$$c = \sqrt{\log\omega},$$

t_{.05} = 1.644854, and τ_{.05} may be determined from (10). Finally, from (8),

$$x_{.05} = \mu + \tau_{.05}\,\sigma. \qquad (12)$$
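The computation just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the punched-card procedure used for the table: the function names are hypothetical, the cubic (11) is solved by bisection on the interval [1, 1 + α₃²] (on which the left side changes sign), β is taken as the upper-tail probability, and only α₃ ≥ 0 is handled. Extremely small positive skewness would need separate treatment because ω − 1 then underflows.

```python
import math
from statistics import NormalDist


def omega_from_skewness(alpha3):
    """Real root of w**3 + 3*w**2 - (4 + alpha3**2) = 0, by bisection."""
    target = 4.0 + alpha3 * alpha3
    lo, hi = 1.0, 1.0 + alpha3 * alpha3  # the cubic changes sign on this interval
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid ** 3 + 3.0 * mid ** 2 < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tau_beta(alpha3, beta):
    """Standardized log-normal deviate exceeded with probability beta (alpha3 >= 0)."""
    t = NormalDist().inv_cdf(1.0 - beta)  # unit normal deviate t_beta
    if alpha3 == 0.0:
        return t  # the limiting normal case
    omega = omega_from_skewness(alpha3)
    c = math.sqrt(math.log(omega))
    return (math.exp(c * t) / math.sqrt(omega) - 1.0) / math.sqrt(omega - 1.0)


upper = tau_beta(0.20, 0.05)  # compare with the tabulated 1.699
lower = tau_beta(0.20, 0.95)  # compare with the tabulated -1.586
```

The two values computed at the end agree with the corresponding entries of Table I to the three decimals printed there.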


Table I contains values of τ_β for β = .005, .01, .025, .05, .10, .90, .95,
.975, .99, and .995 for α₃ = 0(.05)3.00. The computations were performed
with punched card equipment and rounded to three decimal
places.¹ More extensive hand calculations were used to smooth out second
differences where necessary. The tabular values are believed correct
to within one to two digits in the last decimal place. Three point
Lagrangian interpolation [19] may be used to give similar accuracy for
intermediate values of α₃.


EXAMPLE 


A standard laboratory procedure consists of submitting a certain
compound in crystalline form to a grinding process. After 2 minutes
the average diameter of the pieces is determined from a sample of 10,
selected at random from the ground pieces. Considerable experience
shows that the means are well described by a log-normal distribution
with mean 5.76 mm., standard deviation .81 mm. and skewness .20.

A new grinding process is now introduced which may have the property
of displacing the previous distribution to the left, which is an
indication of greater efficiency. In any event the shape of the distribution
will remain invariant. A sample of 10, selected from the result of a 2
minute application of the new process, has a mean diameter of 4.45
mm. Does this represent a significant departure from the established
average for the older process?


¹ A limited number of tables for α₃ = 0(.01)3.00 to four decimal places and the same τ_β are available
upon request to the writer.

The problem reduces to determining the probability that τ, from
equation (8), will have as extreme a value as that noted. Since the shape
of the curve is invariant, σ = .81 and α₃ = .20. One finds

$$\tau = \frac{4.45 - 5.76}{.81} = -1.617.$$

In Table I, for α₃ = .20, it is seen that −1.617 lies between the tabulated
values −1.586 and −1.865, which correspond to values of β equal to
.95 and .975 respectively. Hence in from 2.5 per cent to 5 per cent of
samples will one obtain a mean value of 4.45 or less. One has then good
reason to conclude that the new process is more efficient. If, on the other hand,
one ignored the skewness present and used the normal curve, then one
finds that a value of t = −1.617 would occur by chance between 5 per
cent and 10 per cent of the time. This may be seen when α₃ = 0 in Table
I, which reduces to percentage points of the normal curve.
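As a check on the reading of the table in this example, the lower-tail probability can also be computed directly by inverting equation (8) for t. The sketch below is illustrative only: the function name is hypothetical, the root of (11) is again found by bisection, and only α₃ ≥ 0 is handled.

```python
import math
from statistics import NormalDist


def lower_tail_probability(x, mean, sd, alpha3):
    """P(X <= x) for a log-normal variate with the given mean, s.d. and skewness."""
    tau = (x - mean) / sd
    if alpha3 == 0.0:
        return NormalDist().cdf(tau)  # normal curve, skewness ignored
    # Real root of equation (11) by bisection on [1, 1 + alpha3**2].
    target = 4.0 + alpha3 * alpha3
    lo, hi = 1.0, 1.0 + alpha3 * alpha3
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid ** 3 + 3.0 * mid ** 2 < target:
            lo = mid
        else:
            hi = mid
    omega = 0.5 * (lo + hi)
    c = math.sqrt(math.log(omega))
    arg = 1.0 + tau * math.sqrt(omega - 1.0)
    if arg <= 0.0:
        return 0.0  # x lies below the lower limit a of the distribution
    # Equation (8) solved for the unit normal deviate t.
    t = math.log(math.sqrt(omega) * arg) / c
    return NormalDist().cdf(t)


p_lognormal = lower_tail_probability(4.45, 5.76, 0.81, 0.20)
p_normal = lower_tail_probability(4.45, 5.76, 0.81, 0.0)
```

Under these assumptions the log-normal probability falls between 2.5 and 5 per cent and the normal-curve probability between 5 and 10 per cent, in line with the conclusions drawn from Table I.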


COMMENT 


The critical values tabulated apply to cases where σ and α₃ are
known. In many applications there will exist a fund of experience which
will insure this. Frequently, however, one will have not σ and α₃, but
estimates s and a₃. If s and a₃ are based on small samples, the use of
Table I may well be invalid.

TABLE I
CRITICAL VALUES OF LOGARITHMIC-NORMAL DISTRIBUTION


Skewness    τ.10      τ.05      τ.025     τ.01      τ.005
  (α₃)      τ.90      τ.95      τ.975     τ.99      τ.995
  0.00      1.282     1.645-    1.960     2.326     2.576
  0.00     -1.282    -1.645+   -1.960    -2.326    -2.576
  0.05      1.287     1.659     1.984     2.363     2.623
  0.05     -1.276    -1.631    -1.936    -2.290    -2.529
  0.10      1.292     1.673     2.007     2.400     2.671
  0.10     -1.270    -1.616    -1.912    -2.253    -2.483
  0.15      1.296     1.686     2.030     2.437     2.719
  0.15     -1.264    -1.601    -1.889    -2.217    -2.438
  0.20      1.300     1.699     2.053     2.474     2.767
  0.20     -1.258    -1.586    -1.865+   -2.181    -2.393
  0.25      1.304     1.712     2.076     2.512     2.816
  0.25     -1.251    -1.571    -1.841    -2.146    -2.349
  0.30      1.307     1.724     2.099     2.549     2.865-
  0.30     -1.244    -1.556    -1.817    -2.111    -2.305-
  0.35      1.310     1.736     2.121     2.586     2.914
  0.35     -1.237    -1.540    -1.794    -2.077    -2.263
  0.40      1.313     1.748     2.142     2.622     2.963
  0.40     -1.229    -1.525+   -1.770    -2.043    -2.221
  0.45      1.315+    1.759     2.164     2.659     3.012
  0.45     -1.222    -1.509    -1.747    -2.009    -2.180
  0.50      1.317     1.770     2.185-    2.695+    3.061
  0.50     -1.214    -1.494    -1.724    -1.976    -2.140
  0.55      1.318     1.780     2.205+    2.731     3.109
  0.55     -1.206    -1.478    -1.701    -1.944    -2.101
  0.60      1.319     1.789     2.225+    2.767     3.158
  0.60     -1.197    -1.463    -1.678    -1.912    -2.062
  0.65      1.320     1.799     2.245-    2.802     3.206
  0.65     -1.189    -1.447    -1.656    -1.881    -2.025+
  0.70      1.320     1.807     2.263     2.836     3.255-
  0.70     -1.181    -1.432    -1.634    -1.851    -1.988
  0.75      1.320     1.816     2.282     2.871     3.302
  0.75     -1.172    -1.417    -1.612    -1.821    -1.953
TABLE I (cont.)


Skewness    τ.10      τ.05      τ.025     τ.01      τ.005
  (α₃)      τ.90      τ.95      τ.975     τ.99      τ.995
  0.80      1.320     1.824     2.300     2.904     3.350
  0.80     -1.164    -1.401    -1.590    -1.792    -1.918
  0.85      1.319     1.831     2.317     2.937     3.396
  0.85     -1.155-   -1.386    -1.569    -1.763    -1.885+
  0.90      1.318     1.838     2.334     2.970     3.443
  0.90     -1.146    -1.371    -1.549    -1.735-   -1.852
  0.95      1.317     1.844     2.350     3.002     3.489
  0.95     -1.138    -1.357    -1.528    -1.708    -1.820
  1.00      1.316     1.850     2.366     3.033     3.534
  1.00     -1.129    -1.342    -1.508    -1.682    -1.790
  1.05      1.314     1.856     2.381     3.064     3.578
  1.05     -1.120    -1.328    -1.489    -1.656    -1.759
  1.10      1.312     1.861     2.395+    3.094     3.622
  1.10     -1.112    -1.313    -1.469    -1.631    -1.730
  1.15      1.309     1.865+    2.409     3.123     3.666
  1.15     -1.103    -1.299    -1.451    -1.606    -1.701
  1.20      1.307     1.870     2.423     3.152     3.708
  1.20     -1.094    -1.286    -1.432    -1.582    -1.674
  1.25      1.304     1.873     2.436     3.180     3.750
  1.25     -1.086    -1.272    -1.414    -1.559    -1.647
  1.30      1.301     1.877     2.448     3.207     3.791
  1.30     -1.077    -1.259    -1.397    -1.537    -1.621
  1.35      1.298     1.880     2.460     3.234     3.831
  1.35     -1.069    -1.246    -1.379    -1.515+   -1.596
  1.40      1.294     1.883     2.471     3.260     3.871
  1.40     -1.060    -1.233    -1.362    -1.493    -1.571
  1.45      1.291     1.885+    2.482     3.285+    3.910
  1.45     -1.052    -1.220    -1.346    -1.472    -1.548
  1.50      1.287     1.887     2.492     3.310     3.948
  1.50     -1.044    -1.208    -1.330    -1.452    -1.525+
Skewness    τ.10      τ.05      τ.025     τ.01      τ.005
  (α₃)      τ.90      τ.95      τ.975     τ.99      τ.995
  1.55      1.283     1.889     2.501     3.334     3.985+
  1.55     -1.036    -1.195-   -1.314    -1.432    -1.502
  1.60      1.279     1.890     2.511     3.357     4.022
  1.60     -1.028    -1.183    -1.299    -1.413    -1.481
  1.65      1.275+    1.891     2.519     3.379     4.057
  1.65     -1.020    -1.172    -1.284    -1.395+   -1.460
  1.70      1.271     1.892     2.528     3.401     4.092
  1.70     -1.012    -1.160    -1.269    -1.377    -1.439
  1.75      1.267     1.893     2.536     3.423     4.127
  1.75     -1.004    -1.149    -1.255-   -1.359    -1.420
  1.80      1.262     1.893     2.543     3.443     4.160
  1.80     -0.996    -1.138    -1.241    -1.342    -1.401
  1.85      1.258     1.893     2.550     3.463     4.193
  1.85     -0.989    -1.127    -1.227    -1.325-   -1.382
  1.90      1.253     1.893     2.557     3.483     4.224
  1.90     -0.981    -1.116    -1.214    -1.309    -1.364
  1.95      1.248     1.892     2.563     3.501     4.256
  1.95     -0.974    -1.106    -1.201    -1.293    -1.346
  2.00      1.244     1.892     2.569     3.519     4.286
  2.00     -0.967    -1.096    -1.189    -1.278    -1.329
  2.05      1.239     1.891     2.574     3.537     4.316
  2.05     -0.960    -1.086    -1.176    -1.263    -1.313
  2.10      1.234     1.890     2.579     3.554     4.344
  2.10     -0.953    -1.076    -1.164    -1.249    -1.297
  2.15      1.229     1.888     2.584     3.571     4.373
  2.15     -0.946    -1.067    -1.153    -1.235+   -1.281
  2.20      1.224     1.887     2.589     3.586     4.400
  2.20     -0.939    -1.057    -1.141    -1.221    -1.266
  2.25      1.219     1.886     2.593     3.602     4.427
  2.25     -0.932    -1.048    -1.130    -1.208    -1.252


Skewness    τ.10      τ.05      τ.025     τ.01      τ.005
  (α₃)      τ.90      τ.95      τ.975     τ.99      τ.995
  2.30      1.215-    1.884     2.597     3.617     4.453
  2.30     -0.926    -1.039    -1.119    -1.195+   -1.237
  2.35      1.210     1.882     2.600     3.631     4.479
  2.35     -0.919    -1.030    -1.108    -1.182    -1.223
  2.40      1.205-    1.880     2.604     3.645+    4.504
  2.40     -0.913    -1.022    -1.098    -1.170    -1.210
  2.45      1.200     1.878     2.607     3.659     4.528
  2.45     -0.907    -1.013    -1.088    -1.158    -1.197
  2.50      1.195-    1.876     2.610     3.672     4.551
  2.50     -0.901    -1.005-   -1.078    -1.146    -1.184
  2.55      1.190     1.874     2.612     3.684     4.574
  2.55     -0.895+   -0.997    -1.068    -1.135+   -1.171
  2.60      1.185-    1.871     2.614     3.696     4.597
  2.60     -0.889    -0.989    -1.059    -1.123    -1.159
  2.65      1.180     1.869     2.617     3.708     4.619
  2.65     -0.883    -0.981    -1.049    -1.113    -1.148
  2.70      1.175-    1.866     2.618     3.719     4.640
  2.70     -0.877    -0.974    -1.040    -1.102    -1.136
  2.75      1.170     1.863     2.620     3.730     4.661
  2.75     -0.871    -0.966    -1.031    -1.092    -1.125+
  2.80      1.165+    1.861     2.622     3.741     4.681
  2.80     -0.866    -0.959    -1.023    -1.082    -1.114
  2.85      1.160     1.858     2.623     3.751     4.701
  2.85     -0.860    -0.952    -1.014    -1.072    -1.103
  2.90      1.155+    1.855-    2.624     3.761     4.720
  2.90     -0.855+   -0.945+   -1.006    -1.062    -1.093
  2.95      1.150     1.852     2.625-    3.770     4.739
  2.95     -0.850    -0.938    -0.998    -1.053    -1.083
  3.00      1.146     1.849     2.626     3.779     4.757
  3.00     -0.845+   -0.931    -0.990    -1.044    -1.073


THE PARTITION OF ERROR IN RANDOMIZED BLOCKS*


O. KEMPTHORNE AND W. D. BARCLAY
Iowa State College


THE present note is concerned with procedures which are followed
in the analysis of experiments. From casual examination of the
data, it is thought that the experimental error is not homogeneous over
the observations. It is therefore decided to partition the treatment sum
of squares into components, and to partition the error sum of squares
correspondingly. Tests of significance are then made by comparing the
component of the treatment sum of squares with its corresponding error
component by an F-test, or, if a partition of the treatments is made
into individual degrees of freedom, by t-tests. See, for example, Cochran
[2]. Before deciding to base the interpretation on the partitioning of the
treatment and error sums of squares, it is also somewhat customary to
make a test of the homogeneity of the components of the error sum of
squares by means of Bartlett's test. See, for example, Snedecor [8]
p. 413.

The above procedures must be considered in relation to the requirements
for the analysis of variance. As stressed by various workers,
for example, Cochran [3] and Bartlett [1], there are two distinct
problems which arise, namely, non-additivity and heterogeneity of error.
It is not in general easy to determine whether these problems actually
arise with a given experiment, though they will be often closely
related in their occurrence. On the problem of non-additivity, there is
available one test, namely Tukey's test for non-additivity, which is
based on normal law theory [9]. On the problem of heterogeneity of error,
there are available devices such as the plotting of range against
mean of treatments, and the possibility of applying Bartlett's test to a
decomposition of the error sum of squares.

The relative importance of the two problems depends on the point
of view which the experimenter and statistician adopt. The first point
of view is to regard the particular data which one obtains as a random
sample from the conceptual population of data generated by imposing
each treatment on each experimental unit, this sample having been
obtained by choosing an experimental plan at random from a class of
possible plans. This will be termed the randomization approach and is
described in detail, for example, in [6]. The second point of view is to
regard the particular data which one obtains as having arisen as a random
sample from a population specified by a mathematical model with
normally distributed errors. In the present note, the authors follow the
first approach.

The definition of additivity which will be used here is the following:
that the effect of treatment k is to add a constant to the basic or control
yield of each plot. If one follows the randomization approach, it is
fairly easy to see that additivity will result in homogeneity of error
in the completely randomized experiment, in the sense that variance
between plots treated alike will be independent of treatment. In the
case of the ordinary randomized block experiment, it may be deduced
that with additivity the error variance will be constant for all normalized
treatment comparisons [6]. It appears, therefore, that additivity
of treatment effects is much more important than homogeneity of
error, and this is intuitively reasonable since without additivity the
meaning of estimates of treatment effects and differences is obscure.
Since non-additivity will generally produce heterogeneity of error [5],
a significant result from the application of Bartlett's test could possibly
be used as an indication of non-additivity in the data and an appropriate
measure, such as a transformation of the data, could then be used.
It will presumably be possible to have non-additive effects which give
the same variance for each treatment comparison on the average. However,
the usual procedure of analysis on the observed scale will be reasonably
efficient in such a situation.

When the error is heterogeneous, the usual procedure is to make a
transformation which makes the error as homogeneous as possible.
Additivity on the new scale is then assumed. A test of homogeneity of
variance is therefore desirable, and it is appropriate to consider how
the distribution of Bartlett's criterion is affected by non-normality of
the parent distributions. It has been shown by Fisher [5], Welch [10],
and Pitman [7], for example, that conventional t-tests and F-tests by
and large are satisfactory from the randomization point of view. The
randomization point of view consists of examining the distribution of
the test criterion over the possible sets of data which could arise in the
population of possible randomizations. The over-all test for treatments
has been shown to mirror satisfactorily the corresponding randomization
test. It was therefore decided to examine the extent to which some
other tests reflected the corresponding randomization tests. The tests
considered were the testing of a component of the treatment sum of
squares against the error sum of squares and Bartlett's test of homogeneity
of variances applied to components of the error sum of squares,
specified by orthogonal comparisons of the treatments.

Since a mathematical investigation of these matters is difficult or
perhaps impossible, an empirical study was made. Using the primary
data of Eden and Yates [4], which consist of 8 blocks of 4 plots, a sample
of 1,000 of the possible experimental plans for comparing 4 treatments
in randomized blocks was drawn and examined. The treatment sum of
squares was partitioned into 3 orthogonal single degree of freedom
squares and the error into three corresponding parts each with seven
degrees of freedom. It was found that the F-tests of components of the
treatment sum of squares were satisfactory, but Bartlett's criterion applied
to components of the error sum of squares was distributed in a
markedly different way from expectation under large sample normal
theory. The sample distribution is summarized in Table 1.
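The kind of empirical study described here can be sketched in Python as follows. This is only an illustration of the randomization procedure, not a reconstruction of the authors' computation: the Eden and Yates data are not reproduced, so `hypothetical_yields` generates made-up skewed plot yields, and the particular contrasts, function names and seeds are assumptions. Treatments are taken to have no effect, so that only the random assignment of treatment labels varies from plan to plan.

```python
import math
import random

# Three normalized orthogonal comparisons among 4 treatments (assumed choice).
CONTRASTS = [
    (0.5, 0.5, -0.5, -0.5),
    (0.5, -0.5, 0.5, -0.5),
    (0.5, -0.5, -0.5, 0.5),
]


def bartlett_statistic(sums_of_squares, df):
    """Bartlett's criterion for k sums of squares, each on df degrees of freedom."""
    k = len(sums_of_squares)
    variances = [ss / df for ss in sums_of_squares]
    pooled = sum(variances) / k
    m = k * df * math.log(pooled) - df * sum(math.log(v) for v in variances)
    correction = 1.0 + (k + 1) / (3.0 * k * df)
    return m / correction  # referred to chi-square with k - 1 d.f.


def error_components(yields, plan):
    """Error sum of squares for each contrast; plan[b][p] is the treatment on plot p."""
    n_blocks = len(yields)
    components = []
    for contrast in CONTRASTS:
        z = [sum(contrast[plan[b][p]] * yields[b][p] for p in range(4))
             for b in range(n_blocks)]
        total = sum(z)
        components.append(sum(v * v for v in z) - total * total / n_blocks)
    return components


def rejection_rate(yields, n_plans=1000, critical=5.991, seed=0):
    """Share of random plans in which Bartlett's criterion exceeds the 5% point."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_plans):
        plan = [rng.sample(range(4), 4) for _ in yields]  # randomize within blocks
        stat = bartlett_statistic(error_components(yields, plan), len(yields) - 1)
        hits += stat > critical
    return hits / n_plans


def hypothetical_yields(n_blocks=8, seed=1):
    """Made-up skewed uniformity-trial yields with block effects (not real data)."""
    rng = random.Random(seed)
    return [[b + rng.lognormvariate(0.0, 0.5) for _ in range(4)]
            for b in range(n_blocks)]


rate = rejection_rate(hypothetical_yields())
```

If Bartlett's criterion behaved under randomization as large sample normal theory predicts, `rate` would be close to 0.05; how far it departs depends on the yields supplied.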


TABLE 1


DISTRIBUTION OF BARTLETT'S HOMOGENEITY OF ERROR
CRITERION OBTAINED FROM 1000 RANDOM PLANS COMPARED
WITH APPROPRIATE χ² DISTRIBUTION


Probability     χ² lower    Observed    Expected           (O−E)²/E
                limit                   (normal theory)

0 to .01        9.210          30          10               40.000
.01 to .02      7.824          25          10               22.500
.02 to .05      5.991          78          30               76.800
.05 to .10      4.605         103          50               56.180
.10 to .20      3.219         176         100               57.760
.20 to .30      2.408         117         100                2.890
.30 to .50      1.386         206         200                0.180
.50 to .70       .713         129         200               25.205
.70 to .80       .446          57         100               18.490
.80 to .90       .211          42         100               33.640
.90 to 1.00     0              37         100               39.690


χ² = 373.335, P < .01.
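The arithmetic of Table 1 can be checked with a few lines of Python. The observed and expected counts below are taken from the table; the variable names are arbitrary.

```python
# Observed and expected frequencies from Table 1, largest chi-square class first.
observed = [30, 25, 78, 103, 176, 117, 206, 129, 57, 42, 37]
expected = [10, 10, 30, 50, 100, 100, 200, 200, 100, 100, 100]

# Contribution of each class to the goodness-of-fit chi-square.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

# Plans whose criterion exceeds 5.991, the 5 per cent point: first three classes.
rejection_share = sum(observed[:3]) / sum(observed)
```

The total reproduces the tabulated 373.335, and the first three classes together account for 133 of the 1000 plans.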




It was found that the verdict of heterogeneity of error based on Bartlett's
test at the 5 per cent level would be reached in 13.3 per cent of the
samples. There is therefore a marked tendency to conclude that there
is heterogeneity of error when in fact each of a complete set of normalized
orthogonal comparisons is subject to the same error variance. It
is possible, though unlikely, that the present numerical example is
peculiar, but it seems more reasonable to conclude that it is indicative
of the general situation. Insofar as this is the case we may conclude
that the application of Bartlett's test to the testing of the homogeneity
of error in a randomized experiment is unreliable and should not be used
as a general procedure.

The consequences of concluding that there is heterogeneity of error
when in fact this does not exist are twofold: (1) a loss in sensitivity
of the experiment and (2) an underestimation of the accuracy of some
comparisons with an overestimation of the accuracy of other comparisons.
The situation is not improved by the experimenter making a subdivision
of the error sum of squares which is based on the observed results.
This procedure has two effects: (1) it gives an impression of
higher sensitivity on some treatment comparisons, and (2) it produces
pronounced biases in the estimation of errors of the treatment comparisons.

The present note brings to the fore two problems which need solu- 
tion. The first problem is the extent to which Bartlett’s test, which is 
based on asymptotic theory, can be applied to samples of the sizes nor- 
mally encountered. The second problem is the behavior of Tukey’s test 
under randomization: if this is not satisfactory, the test procedure is 
not reliable for randomized experiments in which plot errors are large. 
For in this case the experimenter is definitely picking an experimental 
plan at random from the appropriate class of plans and should be subject
to probabilities of error of the magnitude he chooses by picking a
significance level. The problem of testing for additivity is crucial in the 
analysis of experiments. It is not reasonable to regard deviations from 
additivity as being additional sources of error variance, particularly 
when these arise as block treatment interactions. 
Credibility Procedures Are Required to Estimate Parameter Values for Individuals of Heterogeneous
Populations. A. L. Bailey, Lumbermen's Mutual Casualty Company.

Problems of estimation for which the classical Bayes' approach would require a knowledge of the
functional form of the a priori distribution are discussed. There are many instances in other fields,
similar to those found in casualty insurance, where there are reliable data available as to the mean and
variance of the a priori distribution even though the true functional form of that distribution is not known.
Such data should be used in conjunction with an assumption that the functional form of the a priori
distribution is the simplest form having the desired mean, variance and range. Specifically, the Beta,
Gamma and Normal distributions should be assumed when the ranges are 0 to 1, 0 to +∞ and −∞ to
+∞ respectively.

Casualty insurance ratemaking has benefited greatly from the use of such procedures which have 
been applied for many years on a rule of thumb basis. The mathematical justification for such pro- 
cedures has recently been developed by the author. In effect, they bridge the gap between the two ex- 
tremes otherwise available to statisticians: one in which they assume that all previously available data 
or data as to other sub-populations is immaterial, and the other in which they assume either that con- 
ditions have not changed from the past or that all sub-populations are homogeneous. 

In many cases the use of available knowledge of the mean and variance of an a priori distribution 
will produce a substantial reduction in the error variance of the estimates being made.
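
The idea can be illustrated with a minimal sketch, assuming a binomial observation and a Beta a priori distribution on the range 0 to 1. The function names and the choice of a binomial likelihood are illustrative assumptions, not taken from the paper. The prior mean and variance fix the Beta parameters by matching moments, and the resulting estimate is a weighted average of the observed rate and the prior mean.

```python
def beta_from_moments(mean, var):
    """Beta(a, b) parameters matching a given mean and variance on (0, 1)."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("variance must be positive and below mean * (1 - mean)")
    total = mean * (1.0 - mean) / var - 1.0  # total = a + b
    return mean * total, (1.0 - mean) * total


def credibility_estimate(successes, trials, prior_mean, prior_var):
    """Posterior mean of a binomial rate under the moment-matched Beta prior.

    Returns (estimate, z), where z is the weight given to the observed rate
    and (1 - z) is the weight given to the prior mean.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    a, b = beta_from_moments(prior_mean, prior_var)
    z = trials / (trials + a + b)
    observed = successes / trials
    return z * observed + (1.0 - z) * prior_mean, z
```

Under these assumptions the weight z grows toward 1 as the number of trials increases, so the estimate moves between the two extremes the abstract describes: ignoring prior data entirely and treating all sub-populations as homogeneous.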


Problems of Data Collection Under Federal Sponsorship. STEPHEN B. WITHEY, University of Michigan.


The purpose behind government data collection is most frequently information for its own sake,
secondly for policy decision or direction, and less frequently basic research having long term goals.
A major problem to the academic scientist is the low priority given to the last. In addition, the social
scientist receives a rather small fraction of the total governmental research budget. Six problem areas 
have general applicability. They are: the general climate in which the contracting governmental agency 
operates; the general climate within the agency (or program) regarding research and data collection; 
the nature and content of the problem of research; timing, pressure and deadlines; acceptance and un- 
derstanding of the findings; personality, interests and attitudes of the individual sponsors. The broad 
similarities and few differences in research for government versus industry are summarily treated. 


The Review and Coordination of Data Collecting Activities Sponsored by the Federal Government. 

Harry ALPERT, Bureau of the Budget. 

Federal sponsorship of social science and statistical research by non-governmental agencies received 
its greatest impetus in the period immediately following World War II as part of the growing over-all 
program of governmental support of scientific research and development. It is estimated that the Fed- 
eral Government spends annually approximately $2.25 millions for contract and grant research involv- 
ing data collecting activities. 

The Office of Statistical Standards of the Bureau of the Budget, as the central coordinating agency 
of the Federal statistical system, believes that data collections included in contracts and grants spon- 
sored by Federal agencies must be coordinated with other parts of the Federal statistical system in 
The survey planning process must of necessity include provision for review and coordination. The
need for basic statistical standards is indicated by several “horror” examples drawn from actual experience
with contractors for Federal agencies.

The major arguments in opposition to Budget Bureau review in the area of contract and grant 
research are discussed. Special consideration is given to the problem of the role of the principal inves- 


includes the freedom not to accept or to seek Federal funds for contract or grant research. It does not in- 
clude the right to do sloppy work—with someone else's money. 


the finest body of solid, meaningful facts we are capable of producing.


Problems of Data Collection Under Federal Sponsorship by Private Service Agencies. ARNOLD J. KING
AND NAOMI D. ROTHWELL, National Analysts, Inc.


objectivity, and induces the flow and interchange of ideas. The disadvantages are that data
collection is separated from the decision maker and the survey may be diverted from testing hypotheses
that are needed to guide decisions leading to action.

The barriers preventing the use of private research firms by the Federal government cited are: 


edge of the contribution which commercial research firms can make to government programs, along 
with some prejudices against the very word “commercial,” and (3) inability of federal agencies to distinguish
between a less reputable, ill-equipped firm and the reputable firm adequately equipped to collect
the data needed, resulting in the tendency to buy research on the basis of price only, which can only
lead to lower standards.

The solutions to the problems are: (1) strengthening the Bureau of the Budget's responsibility in
contracting out sample surveys and (2) the personnel of this Bureau being better informed as to the
technical staff, facilities, standards, and ethics of the commercial firms. It is also suggested that the 


of rail carload waybills currently being secured by the Interstate Commerce Commission. The sample
is selected by the carriers to include all revenue carload waybills numbered “1” or


ment, type of rate, and length of haul. This produces about 30,000 traffic categories, each of which is
relatively homogeneous with respect to the rate characteristics of the traffic. A comparison of


provid te edere dt fot comparable agre, а нао А baron of 
These computed indexes are subject to sampling error and several techniques for estimating their
standard deviations are described. The first is based upon the separation of the sample bills into several
subsamples, development of traffic categories and from them indexes for each subsample. An estimate of
the standard deviation for each index can be made from the variations observed in the subsample indexes.
A second and much less costly, but also less accurate, method estimates the standard deviation
from variations observed in indexes computed from several subsamples of the initially used traffic categories.
These techniques are also applicable to other complex transportation problems.
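
The subsample approach can be sketched roughly as follows. The form of the index (total revenue at current rates over total revenue at base rates), the random assignment of bills to subsamples, and all names are assumptions made for illustration; the abstract does not give the authors' actual index formula.

```python
import math
import random


def index_from_bills(bills):
    """Hypothetical index: 100 * revenue at current rates / revenue at base rates.

    Each bill is a (current_revenue, base_revenue) pair.
    """
    current = sum(b[0] for b in bills)
    base = sum(b[1] for b in bills)
    return 100.0 * current / base


def subsample_std_error(bills, k, seed=0):
    """Split the bills into k random subsamples and index each one.

    Returns (subsample_indexes, std_error), where std_error estimates the
    standard deviation of the mean of the k subsample indexes.
    """
    if k < 2 or len(bills) < k:
        raise ValueError("need at least 2 subsamples and at least k bills")
    rng = random.Random(seed)
    shuffled = list(bills)
    rng.shuffle(shuffled)
    groups = [shuffled[i::k] for i in range(k)]
    estimates = [index_from_bills(g) for g in groups]
    mean = sum(estimates) / k
    var_between = sum((e - mean) ** 2 for e in estimates) / (k - 1)
    return estimates, math.sqrt(var_between / k)
```

The spread among the subsample indexes is the only information used, which is why the method applies to indexes too complex for a closed-form variance.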


Forecasting Fruit Production in California. G. M. KUZNETS, University of California (Berkeley) AND
GEORGE HARVEY, Bureau of Agricultural Economics (Sacramento).

Annual production forecasts of substantial accuracy are required for California fruit crops the disposition
of which is regulated by state or federal marketing agreements. Negative experience with forecasts
based on grower or field men crop ratings has given impetus to development of objective procedures.
The paper deals largely with problems encountered in evolving an efficient forecasting procedure
based on measurements of physical characteristics of a maturing crop. For tree fruit crops, such as
peaches or pears, the characteristics taken into account are number of fruit on tree and fruit size (diameter)
at forecast date. The forecast may take the form of a ratio estimate relating some function
of fruit counts and size measurements in two seasons or a regression procedure which utilizes the relation,
previously established, between harvest weight of fruit (per tree) and early season fruit counts and
size measurements. Data collected in 1952 surveys of clingstone peach and Bartlett pear production
areas in California provided tentative indications of sample size required for specified accuracy and
made it possible to explore such questions as efficiency of partial (single branch or scaffold) fruit counts,
accuracy of on-tree fruit counts not requiring destruction of immature fruit, optimum allocation of
sample blocks and sample trees within blocks, all of which have an obvious bearing on accuracy and cost
of objective procedures.
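
The two forecast forms named above can be sketched as follows. The particular "function of fruit counts and size measurements" used here (mean count per tree times mean cubed diameter, as a rough volume proxy) is an assumption chosen only for illustration, as are all function names; the abstract does not specify the function the authors used.

```python
def crop_indicator(counts, diameters):
    """Hypothetical early-season indicator: mean fruit per tree times mean
    cubed diameter (a rough proxy for fruit volume per tree)."""
    mean_count = sum(counts) / len(counts)
    mean_volume = sum(d ** 3 for d in diameters) / len(diameters)
    return mean_count * mean_volume


def ratio_forecast(prev_harvest, prev_indicator, cur_indicator):
    """Ratio estimate: scale last season's harvest by the change in the indicator."""
    return prev_harvest * cur_indicator / prev_indicator


def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def regression_forecast(past_indicators, past_harvests, cur_indicator):
    """Regression procedure: apply a previously established linear relation
    between harvest weight and the early-season indicator."""
    intercept, slope = fit_line(past_indicators, past_harvests)
    return intercept + slope * cur_indicator
```

The ratio form needs only two seasons of data, while the regression form requires enough earlier seasons (or trees) to establish the relation beforehand.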


Experimental Designs and Probability Sampling in Marketing Research. MAX E. BRUNK AND WALTER
T. FEDERER, Cornell University.
This paper is published in full elsewhere in this issue.


The Problem of Autocorrelation in Regression Analysis. R. L. ANDERSON, North Carolina State College.
Much research has been devoted to the distributions of various statistics used to test the existence 
of autocorrelation of successive observations. Others have studied the problem of estimating parameters 
in various stochastic processes, such as autoregressive and moving average processes. A summary of this 
research is given in this paper. 
Only recently has research been extended to the problem of testing for the existence of autocor- 
related errors in regression models, such as 


$$y_t = \beta_0 + \sum_{i=1}^{r} \beta_i X_{it} + \varepsilon_t, \qquad t = 1, 2, \ldots, n,$$


where the X's are fixed predictors and the $\varepsilon$'s are normally distributed with equal variance. Durbin and
Watson (1950, 1951) present upper and lower bounds on the significance levels for making such tests.
Moran (1950) presents an exact test for r = 1.

Too little information is available on the proper methods of estimating the $\beta$'s when the $\varepsilon$'s are autocorrelated.
Aitken (1935) indicated the exact method of transforming the regression variables when the
autocorrelations were known. Champernowne (1948) added to this general theory and presented a 


Cochrane and Orcutt (1949) used empirical sampling methods to indicate the effects of autocorrelated
errors on the estimates of error and the $\beta$'s. They showed that, in many cases, first differences of the
Y's and X's would have a relatively uncorrelated error process.

Watson (1951) has shown the seriousness of using the wrong type of error process and incorrect 
estimates of the autocorrelations in transforming the regression variables. He concludes that the most 
fruitful research seems to be in utilizing more efficiently the estimates of the autocorrelations. 
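
As a small illustration of the quantities discussed, the sketch below computes the Durbin-Watson statistic from least-squares residuals and forms first differences of a series. It is restricted to a single predictor (r = 1) for brevity, and the function names are illustrative assumptions. The bounds needed for a formal test come from the published tables and are not reproduced here.

```python
def fit_simple_regression(x, y):
    """Least-squares fit of y on one predictor; returns (intercept, slope, residuals)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals


def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals.

    Values near 2 suggest little first-order autocorrelation; values near 0
    suggest positive and values near 4 negative autocorrelation.
    """
    denom = sum(e * e for e in residuals)
    if denom == 0:
        raise ValueError("residuals are all zero")
    numer = sum((residuals[t] - residuals[t - 1]) ** 2
                for t in range(1, len(residuals)))
    return numer / denom


def first_differences(values):
    """Successive differences, as used to reduce autocorrelation in the errors."""
    return [values[t] - values[t - 1] for t in range(1, len(values))]
```

A typical use would be to fit the regression, inspect the statistic, and if it is well below 2 refit on the first differences of both the Y's and the X's.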


Use of Observations Taken Periodically in Growth Studies. H. L. Lucas, North Carolina State College. 


body weights as is customary in feeding trials but also various numbers of the intermediate weights 
which are taken routinely at regular intervals. Following the approach of some previous authors, polynomials
were fitted to the data for each animal, and the coefficients of the polynomials were analyzed
to test dietary effects. This was done for both the weights and the logarithms of the weights. As judged
by the F ratios (between-diet mean square/within-diet mean square) obtained for the polynomial co-


tions, as also might the use of models which more accurately describe the growth curve than do polynomials.
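
The general approach described in this abstract, fitting a curve to each animal's periodic weights and then comparing the fitted coefficients between diets, can be sketched as follows. For brevity the sketch fits only a straight line (the first-degree polynomial) and analyzes its slope; the function names and this simplification are assumptions for illustration, not the author's procedure.

```python
def growth_slope(times, weights):
    """Least-squares slope of weight on time for one animal."""
    n = len(times)
    mean_t = sum(times) / n
    mean_w = sum(weights) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    stw = sum((t - mean_t) * (w - mean_w) for t, w in zip(times, weights))
    return stw / stt


def f_ratio(groups):
    """Between-diet mean square / within-diet mean square.

    `groups` holds one list of per-animal coefficients for each diet.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

To work with logarithms of the weights instead, the same functions can be applied after transforming each weight with `math.log`.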


The Analysis of Biological Time Series. ARTHUR M. DUTTON, University of Rochester.

The methods of classical economic time series analysis are aimed primarily at the problem of testing
hypotheses about, or estimating the parameters in, a fundamental stochastic model which is assumed
to underlie the successive time-ordered observations in a series. The oscillatory properties of
such a series, caused by autocorrelation of the errors, are particularly of interest with respect to predicting
future values of the series.


Use of the Census Current Population Survey to Obtain Information on Morbidity. THEODORE D. WOOLSEY,
U. S. Public Health Service.


The three supplements have provided data of apparently high quality, though of limited scope. In

the second project mentioned above the value of the results was enhanced by a successful follow-up of 

lix more ту disabled cases 37 months after the first of the monthly surveys and 18 months after 
the second.

Experience with collection of morbidity data by this means has indicated that the method is convenient,
speedy, and inexpensive. Estimates derived are applicable to the civilian population of the
country as a whole, exclusive of the inmates of resident institutions,
mates unique in the field of morbidity. Knowled, 


Household Survey on Health Conditions and Medical Care in New York City. NEVA R. DEARDORFF
AND JEROME CORNFIELD.


a longitudinal analysis of the insured population over a four year period. Thus it is possible in the case
of the insured population to compare what respondents said to interviewers with the reports of the
medical groups currently serving these respondents on questions relating to diagnoses and medical
services.

Five thousand households were chosen for the sample of insured persons and the same number for
the sample of the general population. The schedule required an interview averaging fifty-five minutes.
The Project contracted with Alfred Politz Research, Inc. for the interviewer services. The results of the
field operations were summarized.


What Makes a Quality Control Chart Tick. LEO A. AROIAN, Hughes Research and Development
Laboratories.

The general theory of the effectiveness of a quality control chart used alone or with another chart 
was developed recently by H. Levene and L. A. Aroian (Journal of the American Statistical Association
45, Dec. 1950, 520-29). The present paper applies the theory to the case of a single quality control 
chart for attributes (the p chart), under a single simple alternative, an increasing trend alternative, a 
rather chaotic alternative, and an erratic periodic alternative. Tables and charts illustrate the theory. 
The results shed light on the proper design of p charts, the choice of the upper and lower control limits, 
and the sample size. The paper will appear in a future issue of Industrial Quality Control. 


Forecasts of the War Production Authorities. ROBINSON NEWCOMB, Investors Diversified Services.

The first task in projecting defense expenditures is that of arriving at a judgment as to what pro- 
grams will survive the conflict of forces over a period of two to five years at least and what size the pro- 
grams will be after there has been a resolution, temporary or permanent, of the conflicts. Relatively lit- 
tle attention can be paid to current views; more attention must be given to what the views are likely to 
be two to five years hence. This is, of course, a problem in sociology and politics as well as in economics.

Once general conclusions have been reached as to the type and magnitude of programs which will 
be supported two to five years hence, attention must be directed to the technical and economic problems 
involved. Detailed studies must be made program by program of the rate at which production and 
deliveries must rise in order to achieve the goals assumed to survive. These rates must be compared
with feasibility data, again with an eye towards political and economic pressures. The military in many 
instances have set production schedules far above feasibility. Realistic figures must be substituted for 
the military figures in such cases. Finally, judgments must be found both in principle and for specific 
programs as to whether a final state of readiness will rest primarily on stockpiles or will emphasize more 
moderate stockpiles plus standby plants. 

A review of the defense expenditures forecast by the ODM shows that they were far below those 
generally accepted by economists. Nevertheless, the first forecast made in April 1951 was about 5% too 
high for the fourth quarter of 1952. The December 1952 forecast of $54 billion as the possible peak in '53 
may also be somewhat high. 

The combination of anticipated defense needs plus defense demands themselves reached a peak in 
the first quarter of 1951. Business investment, inventory accumulation, and security expenditures have
represented a declining proportion of economic activity since that time, and in general have created 
less and less pressure on prices. The demand for security expenditures and other investment demands 
in 1953 will be much easier to support than they were in 1951 and 1952. 


Elements of a Coordinated System of Vital Records and Statistics. HALBERT L. DUNN, U. S. Public
Health Service.

Local, state, federal, and international units are all active in the vital statistics field, either in 
collecting and preserving vital records, in performing essential services to the public in relation to these 
records, or in producing the statistical by-products. The most important problem facing these diverse 
mechanisms is how to function as a coordinated whole.

With each unit doing its share of the job as it independently conceives it, there is much working 
at cross purposes. No matter how laudable these independent goals, nor how close we come to achiev- 
ing them, what counts with the user of vital records and statistics is the total impact. The local health 
officer, the State registrar, the National Office of Vital Statistics, the statistical units of the inter- 
national organizations—none can afford to “do its job” without considering whether it might more 
appropriately be done by another unit and without full awareness of where “its job” begins and the 
others leave off. 

As a step toward a coordinated system of vital records and statistics, the author defines his personal
viewpoints as to the objectives and essential elements of such a system, and discusses the respective
responsibilities of the various levels of government.
Coordination of Population Estimates Used by Federal, State, and Local Agencies. HENRY SHRYOCK,
JR., Bureau of the Census.

In the field of population estimates, the main problem is not one of a plethora of conflicting figures 
from different sources but rather of a lack of official estimates of any kind for most areas. Regularly 
published estimates of the Bureau of the Census cover mostly the United States and States, with some 
classification by age, sex, and color. The staff of the Population Estimates and Forecasts Unit devotes
from one-quarter to one-third of its time to giving advice (by conference, telephone, or letter) on ad- 


ogy has been published, including methods for current population estimates of cities and counties. 
Source data must be obtained from other agencies, especially the National Office of Vital Statis- 
tics, the Immigration and Naturalization Service, the Department of Defense, and State Departments 
of Education. The nature and availability of these data are discussed from the standpoints of errors in 
the population estimates and lags in their publication.
The use of population estimates by federal, State, and local agencies is also examined, with emphasis
on the specific needs of particular agencies. Many unpublished figures are supplied routinely, or


suggested by the Census Bureau. The department of health is usually the State agency responsible for 
current estimates. Greatest opportunities for cooperative efforts seem to exist in this area. A small but
promising step is represented by the annual Public Health Conference on Records and Statistics, which
has included a work group on “Population Statistics” for several years. Here federal and State statisticians
have been discussing data needs and methodological and procedural possibilities.


The Analysis of Counts of the Extragalactic Nebulae in Terms of a Fluctuating Density Field. D. NELSON
LIMBER, The Yerkes Observatory.


A method has been developed for analyzing the counts of the extragalactic nebulae on the assumption
that the number of nebulae per unit volume at a point r can be expressed as: p(r) = A[1 + D(r)],


Dr) Des) = &:T 1r. -n. 
The expressions for the serial correlations between the counts of the nebulae per unit solid angle 


Preliminary results from an analysis of Professor Shane's counts of the extragalactic nebulae indicate
that the proposed model is quite adequate for explaining the observed general features in a satis-
purpose of measurement the Bureau of Labor Statistics adopted a concept which is a cross between these
two basic approaches.

Obtaining summary measures of output in physical units of production is difficult, because such
units are usually non-additive. Varying physical units of output can be made additive by expressing
them in terms of dollars or employment; however, it becomes necessary to deflate these data by indexes 
of price or labor productivity. 

The Bureau of Labor Statistics has used an employment-man-hour technique which permits the 
estimation of productive capacity. This procedure involves a ratio of potential maximum to current 
man-hours. Such “capacity ratios” were obtained for metal working industries.

Friction items tend to make the operating level lower than the potential maximum. An approxi- 
mation of the friction can be made and an “index of expansibility” which allows for this friction is sug- 
gested as a more realistic measure. 


Census Tracts and Urban Research. DONALD L. FOLEY, University of Rochester.

There has been but limited use of census tract statistics in university-based social research. Urban 
sociologists have been the main consumers, in the research field generally designated as human ecology. 
Five patterns of research use in this field are identified. 

Certain ecological and statistical assumptions implicit in the research use of census tract material 
are examined: the “natural area” concept, the prospects for general theories of urban spatial patterning,
the validity of tract data when used in a statistical index sense, the reliability of tract statistics when 
sampling is involved, limitations to be recognized in interpreting ecological correlations, and the static 
framework in which most tract statistics have been cast. 

It is recommended that in future social science research, (1) more social scientists (especially non-sociologists)
be encouraged to use tract statistics, (2) tract data may be most effectively used in the
spirit of providing rough ecological profiles, (3) the use of tract statistics be integrated with other research
approaches, (4) ecological correlations be used only when relating areal characteristics (and are
not a substitute for individual correlations), (5) ingenuity is needed in introducing new types and forms
of tract information, and (6) within each large city there is continuing need for key researchers to foster
cooperative use of the census tract reporting system.


Current Developments and Problems in Connection with the Census Tract Program. ROY V. PEEL,
Bureau of the Census.

The Census Bureau participated in the Census tract program primarily through establishing 
standards and through making tabulations. However, we have now come to the conclusion that a
system of small areas with fixed and well-defined boundaries should be established through the exten- 
sion of the census tract program. These would assist in establishing stable administrative units for tak- 
ing censuses and would enable the publication of data for such stable areas as local needs justify and ap- 
propriations permit. A related development is the establishment for the 1950 Census of census county 
divisions as relatively permanent units for presenting statistics for the State of Washington. Discussions 
are going on in a number of other States where minor civil divisions do not have stable boundaries to
explore the possibility of delimiting similar areas. During the last decade, marketing groups took the 
initiative in acquiring and presenting retail trade data and data from other sources by tracts or groups
of tracts and demonstrated the utility of tracts for marketing uses. As a result, the Bureau of the Census 
is collaborating in the development of groups of census tracts into retail trade areas for presentation 
of data. 


What an Economist Wants to Know in the Way of Saving Data. GERHARD COLM, National Planning
Association.

Balanced economic growth requires that demand for goods and services increase roughly in pro- 
portion with productive capacity. The economist concerned with economic growth is interested in saving 
as one of the factors limiting demand. Since future consumption is customarily projected by deducting
an estimate of saving from disposable income, we need to know what is the “normal” rate of saving which 
should be assumed from a given disposable income. 

Some studies seem to indicate that the rate of saving, in the long run, fluctuates around a rather 
stable trend line. Others, particularly family budget studies, reveal a persistent tendency for saving
to rise with rising incomes. Although we now have a wealth of saving statistics, they do not permit
a conclusive answer to the specific and vitally important question: Is the recent rise in the saving ratio
due mainly to extraordinary factors and hence to be discounted in projections of the future saving ratio, 
or does it largely reflect the fact that incomes have gone up and hence indicate a still higher saving ratio 
in the future if incomes continue to rise?

Questions of this type may require the collection of additional statistics but they particularly re- 
The Next Steps in the Statistical Study of Saving. RAYMOND W. GOLDSMITH.

The paper briefly discusses the following twelve steps: full integration of estimates of saving into a
system of social accounts, correlation of estimates from balance sheet and income account, and from 
aggregate and sample data, more detailed explanation of sources and methods of estimates; provision 
of variant estimates, such as cash saving and saving following the business accounting rather than 


saver groups; finer breakdowns by forms of saving; less “netness” in estimates; appraisal of margin of
errors in estimates; appraisal of motivational significance of estimates; tie-in of data on current saving,
cumulated (life) saving, and wealth; tie-in of quarterly or monthly with annual data.


The Use of Laboratory Experiments in Teaching Probability Statistics. A. C. ROSANDER, George Washington
University.


Furthermore the traditional materials of demonstrating probability (coins, dice, cards and wheels)
need to be supplemented by new devices and apparatus in order to demonstrate important new
principles and techniques in sampling, estimation, experimentation, and inference. Four such new
devices are described in detail, devices which are designed to demonstrate the principles of subsampling,
group sampling units, problems of sampling design, and the analysis of variance.

In order to obtain the maximum return from the inductive approach to probability statistics, a
formal set of 45 experiments is listed and the materials required to perform them are itemized. These
experiments, many of which have already been tested in the classroom or on the job, are organized to
parallel a systematic development of the subject.


the student but the teacher in charge of such a laboratory course, to stress the understanding of basic 
principles, to show how to apply these principles to real problems, to show how to record and to process 


date with these experiments, are described briefly. Several references to experiments in probability
Multiple Comparisons. JOHN TUKEY, Princeton University.


The most useful type of statistically-based statement is of the form “so-and-so is equal to such-and- 
such within thus-and-such a margin of error.” All statisticians realize that any such statement, no


matter how wide the margin, is liable to error. In the best-regulated situations we can control the rate
at which errors are made (as in simple confidence limit situations). 

If we have a number of determinations (measurements, $z_i$, of certain long-run values, or determinands,
44) which we wish to treat as having a common variance $\sigma^2$ (of which we have an unbiased estimate,
$s^2$, whose stability corresponds to a given number of degrees of freedom, DDF) then we are faced with
a problem of parallel determinations or multiple comparisons, or both. Upon investigation, a number of 
different and precisely definable types of error rate arise, including: per determination, per batch, 
batchwise, per comparison, per family and familywise. An error rate of p% familywise, for example, 
means that in (100 − p)% of all families analyzed, all the comparative statements among the various
determinations are within their indicated margin of error. 

The numerical details of setting such margins for error, termed allowances, for an error rate of 5%
familywise are discussed and the necessary tables given. It is computationally convenient to calculate
first the relatively familiar least significant difference or LSD, which equals $s\sqrt{2}$ times the 5% point of
Student's t. It is then possible to find the allowance appropriate for simple comparisons of the form
$z_i - z_j$ to an adequate approximation as


WSD (2. Hind ) LSD) 
pprJ * 


where $A_m$ and $B_m$ depend only on the number, m, of determinations in the family. (Tables for $1 \le m \le 20$
are given.)

The allowance for any linear combination $\sum c_j z_j$ is the norm of $\{c_j\}$ times the WSD, where this
norm is the sum of the positive $c_j$, or minus the sum of the negative $c_j$, whichever is larger.

Numerical examples are discussed, and generalizations and extensions are mentioned. 
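
The rule for linear combinations stated above is simple enough to sketch directly. In the sketch below the WSD and the 5% point of Student's t are taken as given inputs, since the tables from which the paper obtains them are not reproduced in this abstract; the function names are illustrative assumptions.

```python
import math


def least_significant_difference(s, t_point):
    """LSD = s * sqrt(2) * (tabled 5% point of Student's t).

    `t_point` must be looked up by the user for the relevant degrees of freedom.
    """
    return s * math.sqrt(2.0) * t_point


def coefficient_norm(coeffs):
    """Sum of the positive coefficients, or minus the sum of the negative
    coefficients, whichever is larger."""
    positive = sum(c for c in coeffs if c > 0)
    negative = -sum(c for c in coeffs if c < 0)
    return max(positive, negative)


def allowance(coeffs, wsd):
    """Allowance for the linear combination with the given coefficients."""
    return coefficient_norm(coeffs) * wsd
```

For a simple comparison with coefficients (1, -1) the norm is 1, so the allowance reduces to the WSD itself, as the text implies.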


Experimental Designs for the Physical Sciences. W. J. YOUDEN, National Bureau of Standards.

In experimental design a block consists of a number of “treatments,” “varieties” or items grouped 
together in the experimental program. Intrablock comparisons are more precise than comparisons involving
two blocks. For many physical experiments the block size is sharply fixed and frequently accommodates
only two or three items. The high precision of physical measurements makes it unnecessary
to use many replications. These conditions favor the use of partially balanced designs. Examples given
illustrate the use of partially balanced designs for blocks of two.


Some Potential Contributions of Mathematics to Social and Economic Statistics. FREDERICK F.
STEPHAN, Princeton University.

American statisticians have attempted to meet a growing demand for statistical information for a 
century or more. They have not centralized the collection of economic and social statistics but they
have sought improvements in accuracy and dependability whenever such data are assembled for general 
use. The accumulation of experience and know-how has led not only to higher quality and wider use but 
to a still greater demand for accuracy. This need for accuracy and for more statistical material cannot be
met in the future merely by experience; some new methods based on mathematics and certain branches
of applied science are required. 

Mathematics can contribute a precise and powerful language, an instrument of analysis, a vehicle
for importing useful results from other sciences, and a basis for a systematic theory of the production 
and use of statistical information. Examples of its contributions can be found in counting, classification, 
calibration, measurement, time series, and various other aspects of statistics. R. W. Burgess’ advice to 
the statistical forecaster can be extended by adding practical admonitions to the statistician who tries 
to put mathematics to work on these problems.


The New Electronic Machines and the Future of Statistics. CUTHBERT C. HURD, International Business
Machines Corporation.

The contributions of statisticians to the development of automatic data processing machines are 
discussed. These include large problems such as furnished by the U. S. Census, as well as the training 
of personnel and the writing of procedures for data processing installations. Problems to be solved are 
divided into two general categories, one having little data and a large amount of processing, the other 
having a great deal of data and a relatively small amount of processing. Electronic developments for
compact, rapid access storage and high speed computing circuits and their relation to stored program
operation are discussed. The application of new machines to statistical problems is described in con- 
nection with problems of recording, editing, and error detecting. 
Some Statistical Problems in Field Geology. HOWARD J. PINCUS, Ohio State University.

Many geological studies include data describing orientations in 2 or 3 dimensions. Analyses are 
typically directed toward drawing inferences regarding direction and intensity of factors such as deforming 
forces and depositional agents. 3-dimensional field studies most commonly use strike (the direction 
with respect to true north of the line of intersection between the given plane surface and the local 
horizontal), and the dip (the given plane’s inclination, measured from the horizontal in a vertical plane 
normal to the strike). The range of strike and of dip is 180°. 

Problems of sampling and measurement are often complicated by the high order of variability of 
materials and the paucity of suitable data. 

In studies of rock fractures, shear sets produce bimodal distributions with interdependent con- 
centrations. Problems such as determining sample size, establishing sampling schemes, and estimating 
mean dihedral angles between pairs of planes must await the application of adequate distributions of 
periodic variates. 

Graphic methods have been used for evaluating modes and for simple analyses of both 2- and 
3-dimensional data. 

Using as models "uniform" or Poisson distributions (as plotted on polar, rectangular, and spherical 
systems), observed data have been compared to the models with chi-square and other tests. 

Circular normal theory appears to be applicable to some of the orientation problems encountered. 
Applications are to be presented in detail in the literature. Use of distribution functions provides con- 
siderably more information than merely establishing “significant” departure from arbitrary standards 
of uniformity. 
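
The chi-square comparison with a uniform model mentioned above can be sketched as follows. This is only an illustration in Python, not the author's procedure: the function name `chi_square_uniform`, the choice of six 30° classes, and the strike readings are all assumptions made for the example.

```python
def chi_square_uniform(strikes_deg, n_bins=6):
    """Chi-square statistic comparing strike readings (0-180 degrees)
    with a uniform model. Returns (statistic, class counts, degrees of freedom).
    The class width and the data passed in are illustrative assumptions."""
    width = 180.0 / n_bins
    counts = [0] * n_bins
    for s in strikes_deg:
        # Strike is an axial quantity, so readings are reduced modulo 180.
        idx = int((s % 180.0) // width)
        counts[min(idx, n_bins - 1)] += 1
    expected = len(strikes_deg) / n_bins  # equal expectation in every class
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat, counts, n_bins - 1


# Hypothetical readings concentrated in two directions.
readings = [10, 12, 15, 20, 25, 28, 95, 100, 105, 110, 115, 118]
stat, counts, df = chi_square_uniform(readings)
print(stat, counts, df)
```

A large statistic only signals departure from uniformity; as the abstract notes, fitting a distribution function says more about the form of the concentration than the test alone.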


Applications of Statistical Methods to Sedimentary Rocks. W. C. KRUMBEIN, Northwestern University. 

Statistical methods find wide application in geology, especially in the study of textures, structures, 
and composition of sedimentary rocks. Certain apparent irregularities in the data, such as highly 
skewed distributions, use of weight instead of number frequencies, use of unequal class intervals, and 


sition data are commonly binomial or Poisson distributions. 

The present paper sketches the development of statistical thinking in sedimentation and includes 
a discussion of some geological problems that can be attacked statistically. The discussion is extended to 
include problems of sampling, relations between sample and population, questions of areal variation in 
sediment properties, design of experiments, and other aspects in need of continuing statistical analysis. 


Climatology's Needs in Statistical Research. ARNOLD COURT. 


Climatology as a separate science began early in the 19th century, and its development has roughly 
paralleled that of statistics. Many statistical techniques were applied to climatic problems as soon as 


niques, such as the polar diagram (L. von Buch, 1818) and isopleth diagram (L. Lalanne, 1846) originated 
in climatic representation. However, despite occasional flurries of interest (H. Meyer 1891; C. F. 
Marvin et al. 1915-22), use of statistical methods in climatology has not kept pace with that in other 


and thus concentrate on the central portions of the frequency distributions of the various elements; 
engineering studies assess the probable frequencies of desirable or harmful occurrences, usually the 
extremes of one or two elements, separately or in combination. 


such averages. Finally, more attention must be given to the problem of analyzing dependent data in 
which dependence decreases rapidly (in time or space), as do most data of climatology. 


Estimates of Labor Force, Employment, and Unemployment, 1900-1950. STANLEY LEBERGOTT, Bureau 
of the Budget. 

Estimates of labor force, employment and unemployment were presented for the years 1900-1950. 
Totals for each of these series are intended to be comparable with the current monthly estimates of the 
Census Bureau. Estimates of employment by major industry group, and for a variety of minor groups, 
are comparable with the current series of the Bureau of Labor Statistics. 

These series differ from earlier estimates for a variety of reasons. They are comparable with the 
current official series; they draw upon data which have become available only in recent years; they rest 
on an evaluation of statistical relationships in the entire half century; and they give more explicit al- 
lowance for the possible impacts of cyclical changes and wartime production on employment than some 
earlier estimates could. 


Statistical Problems Encountered in the Work of the Commission on Financing of Hospital Care. 
ISIDORE ALTMAN AND ROBERT M. SIGMOND. 

The Commission on Financing of Hospital Care, created to study “the costs of providing adequate 
hospital services and the determination of the best systems of payment for such services,” is carrying 
on а number of studies in the fields of hospital fiscal problems, physician-hospital relationship, prepay- 
ment for hospital care, and financing of hospital care for low-income groups. All the studies have sta- 
tistical aspects. Three of these—coverage of the population by hospital insurance, the characteristics 
of hospitals with high and low operating costs, and the attitude of the American people toward hospital 
insurance techniques are involved. Problems concerning choice of procedure, construction of question- 
naires, most appropriate sources of information, evaluation of available data, most fruitful investment 
of time and energies, etc., are discussed. The role of the statistician in the setting of a study commission 
is also discussed: to supervise statistical activities of the staff, to educate his co-workers to the careful 
use of statistical data, and to serve as “philosopher and statesman.” 


Some Problems in Determining Maxima of Functions of Several Variables. FREDERICK MOSTELLER, 
Harvard University. 

This paper discusses problems in determining extremals (maxima or minima) of functions of several 
continuous variables. The principal innovation is to propose a way of measuring the "togetherness" of 
the large values of the function. This measure provides some idea of whether a function will be amenable 
to sequential techniques like steepest ascent, whether random drawing will do nearly as well, or whether 
some change in the coordinate system would be desirable. 

Some of the advantages of random drawing of points are discussed, and the results of random 
drawings are compared with sequential techniques; a modification of the random technique that makes 
it a sequential technique is described. Effects of errors of measurement are discussed briefly. 
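
Random drawing of points, as discussed above, can be illustrated with a short sketch. It is not taken from the paper: the function name `random_search_max`, the test surface, the search region, and the number of draws are hypothetical choices, and the "togetherness" measure itself is not reproduced here.

```python
import random


def random_search_max(func, bounds, n_draws, seed=0):
    """Draw points uniformly at random inside `bounds` and keep the best one.
    `bounds` is a list of (low, high) pairs, one per variable."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    best_point, best_value = None, float("-inf")
    for _ in range(n_draws):
        point = [rng.uniform(lo, hi) for lo, hi in bounds]
        value = func(point)
        if value > best_value:
            best_point, best_value = point, value
    return best_point, best_value


def surface(p):
    """Hypothetical smooth surface with its maximum (value 0) at (0.3, 0.7)."""
    return -((p[0] - 0.3) ** 2) - (p[1] - 0.7) ** 2


unit_square = [(0.0, 1.0), (0.0, 1.0)]
print(random_search_max(surface, unit_square, n_draws=200))
print(random_search_max(surface, unit_square, n_draws=2000))
```

Because the same seed is reused, the 2000-draw run contains the 200-draw run as its first part, so its best value can never be worse; how quickly the best value improves depends on how concentrated the large values of the function are.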


Recent Advances in Finding Best Operating Conditions. R. L. ANDERSON, North Carolina State College. 


This paper discusses various experimental procedures used to estimate the optimum point on a response 
surface, 

$$y = \phi(f_1, f_2, \ldots, f_n),$$

where $y$ is the response and $f_i$ the amount of the $i$th factor used in producing $y$. Multi-factor experiments 
were first set up to investigate one factor at a time; then Fisher (1935) and Yates (1935, 1937) introduced 
the complete factorials for field experiments, plus confounded arrangements for incomplete blocks de- 
signs. More recently, fractional replication designs have been introduced in order to cut down the size 
of the experiment; see for example, Kempthorne (1952). 

Hotelling (1941) derived methods of locating the optimal point using a single factor. Friedman and 
Savage (1947) outline a sequential one-factor-at-a-time experimental plan when several factors are involved. 

Box and Wilson (1951) present a method to determine the vicinity of the optimum by use of the 
“path of steepest ascent.” They determine this path from preliminary experiments, assumed far enough 
removed from the optimum so that the response is essentially planar. When one approaches the opti- 
mum, new experiments are used to estimate quadratic and interaction effects. On the basis of a series 
of experiments using the Box-Wilson methods, we concluded: (i) The experimental error must be small. 
(ii) The experimenter must know enough about the response surface so that the nature of the reaction 
does not change as he proceeds from the starting point to the optimum. (iii) He must be able to start 
with factor levels spaced far enough apart to indicate linear effects if they do exist, and still not have 
important interaction and quadratic effects. 
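
The first stage of the Box-Wilson approach described above, fitting a plane and moving along the path of steepest ascent, can be sketched roughly as follows. The sketch assumes an orthogonal two-level design coded as -1 and +1, for which each slope is simply the signed average of the responses; the function names and the four responses are hypothetical.

```python
import math


def first_order_fit(design, responses):
    """Fit y = b0 + sum(b_j * x_j) for an orthogonal two-level design
    coded -1/+1. Valid only under that coding (an assumption of this sketch)."""
    n = len(responses)
    k = len(design[0])
    b0 = sum(responses) / n
    slopes = [
        sum(run[j] * y for run, y in zip(design, responses)) / n
        for j in range(k)
    ]
    return b0, slopes


def steepest_ascent_path(slopes, step=1.0, n_steps=3):
    """Points along the direction of the fitted gradient, in coded units,
    measured from the centre of the design."""
    norm = math.sqrt(sum(b * b for b in slopes))
    direction = [b / norm for b in slopes]
    return [[step * i * d for d in direction] for i in range(1, n_steps + 1)]


# Hypothetical 2x2 factorial with made-up responses.
design = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
responses = [50.0, 56.0, 58.0, 64.0]
b0, slopes = first_order_fit(design, responses)
print(b0, slopes)
print(steepest_ascent_path(slopes))
```

Further trials would be run at the points along this path until the response stops improving, after which quadratic and interaction terms are estimated, as the abstract indicates.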


City and Area Statistics—Chamber of Commerce Experience. LEONARD A. DRAKE, Chamber of Commerce 
of Greater Philadelphia. 

There is a great volume of local statistics available to agencies willing to spend the time and funds. 
So much is available to me in Philadelphia that, with a limited staff, we must forego the application of 
advanced statistical procedures to the raw data. 

There is very seldom need for anything beyond say a moving average for smoothing purposes, 
elementary sampling techniques, and an occasional seasonal index. Of utmost importance is common 
sense plus experience in handling the wealth of local data. For example, what data is important and what 
does one discard; which statistics are of doubtful accuracy and to what degree; what are the proper 
methods of tabular presentation; how may the data best be illustrated, by diagram or map? 

One of the most interesting fields of our statistical work is the projection of population trends— 
total, Negro, school, by age groups, by city subdivisions, and by counties. I have seen some highly refined 
statistical measures applied to population projection at both Philadelphia and national levels go 
haywire because of wrong basic assumptions. 

It is much more important to make informed "guesses" relative, for example, to the impact of 
earlier marriages and high birth rates in an era of full employment, or how much impact the area's 
new steel industry will have, than to work out rigid mathematical curves projecting historical data 
twenty or thirty years ahead. 

The use of local, business, and civic statistics, whether in forecasting, projection, or straightaway 
analysis, requires a maximum of common sense, experience, and imagination and a minimum application 
of text book tools of correlation and other refined statistical manipulations. 

There is still great need for educating the business community on the value of statistics and sta- 
tistical, economic, and market research. My method is to make all statistical reports as graphic as 


Notes on the Revision of the Consumer Price Index. EDWARD D. HOLLANDER, Bureau of Labor Statistics. 

The Bureau of Labor Statistics has completed the first revision of its Consumer Price Index in 15 
The revised index is essentially unchanged in purpose, design and in most aspects of measure- 
ment. It is designed primarily as a price deflator of wage income. The Consumer Price Indexes are 
designed as Laspeyres indexes, but in practice, fixed-weighted indexes cannot very long be maintained. 
There is theoretical objection to an index which assumes complete inelasticity of demand through the 
ranges of prices and income situations. 
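
For readers unfamiliar with the term, a Laspeyres (fixed base-weight) index of the kind referred to above can be computed as in the sketch below. The item names, prices, and quantities are invented for illustration and have nothing to do with the Bureau's actual market basket or procedures.

```python
def laspeyres_index(base_prices, current_prices, base_quantities):
    """Cost of the base-period basket at current prices, relative to its
    cost at base prices, times 100. The quantities stay fixed at their
    base-period values, which is the fixed-weight assumption the text
    says cannot be maintained for very long."""
    current_cost = sum(current_prices[item] * q for item, q in base_quantities.items())
    base_cost = sum(base_prices[item] * q for item, q in base_quantities.items())
    return 100.0 * current_cost / base_cost


# Hypothetical two-item basket, prices in cents.
base_prices = {"bread": 20, "milk": 10}
current_prices = {"bread": 25, "milk": 11}
base_quantities = {"bread": 100, "milk": 200}
print(laspeyres_index(base_prices, current_prices, base_quantities))
```

Because the quantities never respond to the price changes, the formula behaves as if demand were completely inelastic, which is the theoretical objection noted above.
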
From a purely theoretical point of view, the purposes of a deflator over time are served by an index 
of the cumulative effect of price changes on the purchasing power of income in two situations, in which 


Population characteristics and the standard of living are part of the weights. Changes in expenditures 
arising from changes in these characteristics are treated in ways that do not affect the level of the 
index. Economic logic of the index formulation requires that weights change with the manner and level 


of data and calculation of indexes as efficient, both statistically and administratively, as possible, with a 
minimal sacrifice of precision. The index includes 46 cities, stratified by size, climate, income and density, 




SUMMARIES OF PAPERS AT THE 112TH ANNUAL MEETING 629 


index. The variability among “outlets” (retail stores and service establishments) appears to be some- 
what less than the variability among items. 

A distinction is made in the revised index between the monthly and annual indexes. The annual 
indexes, calculated with seasonally varying quantity weights and incorporating certain benchmark 
corrections, will be the most precise measures of year-to-year price movement. To the extent that the 
indexes can approximate the price effect of continuous price-quantity changes, they should provide 
suitable deflators of wage and salary income, and better measures than we have had previously of the 
long-term trends in real incomes and standards of living. 


Reliability of Soviet Industrial and National Income Statistics. ALEXANDER GERSCHENKRON, Harvard 
University. 

The deficiencies of Soviet economic statistics stem from a variety of sources: 1) The economic back- 
wardness of a country with but a brief tradition of mass literacy. 2) The institutional setting of the Five 
Year Plans and the character of the Soviet economy as a “deficit economy” which induce managers of 
industrial enterprises to falsify production reports either in order to give the impression of better results 
than those actually attained or in order to hide actually produced output for purposes of various illicit 
transactions. 3) Distortion of statistics by the central authorities in the interest of propaganda as a rule 
resulting in overstatements of the data on industrial output and national income in general. 4) The 
very small volume of statistical information published. 

These deficiencies severely limit the reliability of Soviet industrial and national income statistics. 
As a rule, it has been impossible to make any significant corrections for deficiencies listed under (1) and 
(2) above. On the other hand, western scholars have in the past succeeded in many instances in penetrating 
the propaganda veil spread by the central authorities and in reaching some significant conclusions. 
This was possible because in general such figures as were given did not represent sheer inventions, but 
had some meaning and significance which made it possible for critical analysis to uncover the distortions 
and to attempt corrections. It is unknown whether the same opportunities for critical research will 
exist in the future. The temptations of the cold war may well induce the Soviet authorities to resort to 
publication of data which will be based on nothing but sheer inventions. The extreme paucity of present 
statistical information would facilitate such a course because it could be pursued without much fear of 
obvious inconsistencies and contradictions. 


Reliability and Usability of Soviet Statistics: A Summary Appraisal. ABRAM BERGSON, Columbia University. 

For Western students of the Soviet economy, an initial difficulty in the way of serious research so far 
as Soviet statistical data are concerned arises from the Soviet government's policy of withholding information. 
This policy is not new, but in the course of time the government has become progressively 
more secretive. For some years the government has been withholding statistical data not only on matters 
of immediate military concern, but also on the economy generally. 

As to the statistics that are published, their quality is affected adversely, though to a degree which 
is often conjectural, by a variety of features: falsification and inefficiency in reporting of raw data by 
lower echelons; deficiencies in the collection, processing and publication of data by the higher echelons. 
The effect of these limitations most often is to give an unduly favorable impression of the Soviet economy, 
but there are reasons to think, nevertheless, that the higher echelons do not generally resort to 
falsification in the sense of free invention and double bookkeeping. Accordingly there is at least a core 
of fact in Soviet statistics and much of the research being pursued today by Western scholars rests ultimately 
on the supposition that this is so. The evidence for this supposition, however, is not now as impelling 
as it once was. Accordingly, this notion has to be constantly reviewed. 


The Nature of Soviet Population and Vital Statistics. FRANK LORIMER. 

Questions concerning the reliability of Soviet statistics during the 1920's are purely technical. 
Data on such items as age and mortality were seriously affected by inaccurate or incomplete reporting, 
but the 1926 census data were presented in complete detail, and rapid progress was made in the im- 
provement of vital statistics. 

After about 1930, progress in demographic statistics—in publication and also in technical reliability 
of information available to the government—was eclipsed by a spreading cloud of anxiety and secrecy. 
The first clear indication of an ominous trend in official information on the Soviet population was the 
suppression of regular detailed reports on vital statistics. There can be no doubt that this drastic action 
was motivated chiefly by anxiety to conceal the excess death of several million people during the forced 
collectivization of agriculture and associated disorders (as shown by intensive analysis of later official 
information). It is also true that civil registration was for a time seriously disordered by these calamities. 

Estimates published in the Second Five-Year Plan show gross errors in the estimates of current 
population, as well as in population projections. Possible reasons for such error are discussed. It is 
probable that the discrepancy between the expected number and that indicated by returns from the 
1937 census was largely responsible for the suppression of the latter and the purge of officials in charge 
of its administration. 

The treatment of the 1939 census 
deemed “fit” for publication. The peculiar device of 


soon disrupted by war, 

New techniques involving quick enumerations, especially in cities, and registration procedures, 
especially in rural areas, were developed during the war and post-war period. Secret data at the disposal 
of the government obtained in this way is being gradually extended and controlled with respect to 
quality, so that such information may now approach tolerable accuracy. 

In conclusion, emphasis is placed on distinctions between the three rather distinct, though related, 
questions of “reportorial fidelity,” “fidelity to science,” and “technical accuracy.” It is assumed that 
official technical releases have, up to the present time, generally respected the first of these principles, 
but since the early years of the regime, have ignored the second. The third of these problems requires 
greater emphasis than it has sometimes received in the use of partial information on the Soviet population 
by western scholars. 


Agricultural Statistics in Soviet Russia: Their Usability and Reliability. LAZAR VOLIN. 

Agricultural statistics are important in Russia because of the importance of the harvest to the 
well-being of the Russian people, as well as to the economic program of both the Tsarist and Soviet 
regimes, and because of the crucial role played on the Russian socio-economic scene by the agrarian 
problem. 

Even before the Second World War significant agricultural information was not as abundant as in 
the 1920's, and, since the war, little has been published. The reliability of the figures is often inferior to 
their quantity. 

Unqualified acceptance of official crop yields and production statistics is opposed. These figures are 
preharvest estimates of the standing crop, which do not take into account the large harvesting losses 
common in the USSR, and lend themselves to over-estimation. 


A Comparison of Two Different Methods of Caseworkers' Judgments of Movement. CHARLES GERSHENSON, 
Jewish Children’s Bureau of Chicago. 


Pearsonian correlation coefficients were computed for the 9-point scale and rank correlation coefficients 
for the ordered data. For the group of ten workers the mean reliability of the 9-point rating scale 
... and the corresponding mean reliability for the ranking method was .76. The nine workers of 
the second group had a mean reliability for the rating method of .76 and .79 for the ranking method. 
... judge of .65 for rating and .80 for the ranking procedure. The remaining judges showed no difference 
greater than .04 between the two reliabilities. 

The caseworkers indicated that some prefer one method in comparison to the other and there was 
no general consensus favoring any one procedure. 


The Radiocarbon Calendar. W. F. LIBBY, University of Chicago. 

The use of natural radioactivities to measure geologic time dates back to the beginning of this 
century. The occurrence of radioactive carbon 14 in nature, due to the action of cosmic rays, provides us 
with an accurate calendar of man’s past, since its half-life of 5600 years coincides with the span of historic 
time. New measurement techniques, using the screen wall counter, have been developed to achieve the 
necessary sensitivity for this very weak activity. These techniques are described, together with some of 
the results already achieved with them. 
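
The way a measured activity is converted to an age follows from the exponential decay law. The sketch below uses the 5600-year half-life quoted in the abstract; the function name and the sample activity ratios are illustrative assumptions, and counting error and background corrections are ignored.

```python
import math

HALF_LIFE_YEARS = 5600.0  # half-life of carbon 14 as quoted in the abstract


def radiocarbon_age(activity_ratio, half_life=HALF_LIFE_YEARS):
    """Age in years from the ratio of the sample's carbon 14 activity to
    that of living material, assuming simple exponential decay:
    ratio = (1/2) ** (age / half_life)."""
    if not 0.0 < activity_ratio <= 1.0:
        raise ValueError("activity_ratio must lie in (0, 1]")
    return -half_life / math.log(2.0) * math.log(activity_ratio)


# Hypothetical samples retaining one half and one quarter of the modern activity.
print(radiocarbon_age(0.5))
print(radiocarbon_age(0.25))
```

A sample with half the modern activity is one half-life old and a sample with a quarter is two half-lives old, which is why the method suits the span of historic time.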


Ocean Surface Waves: An Analysis of Their Appearance, Propagation, and Properties in Terms of 
Power Spectra, Stationary Time Series and Statistics. WILLARD J. PIERSON, JR. 

A combination of classical hydrodynamics and time series theory is shown to give many observable 
properties of ocean waves. The combination permits adequate statistical descriptions of the waves and 
yields methods which permit the waves to be forecasted both in the storm area and after they have dispersed 
out of the storm area. The waves are shown to be a quasi-homogeneous Gaussian process. 

A summary of the methods of classical hydrodynamics is given, and it is shown that the classical 
theory does not go far enough in a statistical and practical sense. The methods used by the geophysicist, 
based on averages, are summarized and shown to be inadequate. The early search for the spectrum of 
the waves is reviewed. Then the application of time series theory to defining a realistic power spectrum 
and adequate statistical parameters is given. Finally the results of classical hydrodynamics and time 
series methods are synthesized to obtain results of practical and theoretical usefulness. 
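
As a rough illustration of what estimating a power spectrum from a wave record involves, the sketch below computes a raw periodogram by direct summation. It is a simplified stand-in and not the author's method: the function name, the normalization, and the artificial sinusoidal record are assumptions, and no smoothing is applied.

```python
import cmath
import math


def periodogram(record):
    """Raw periodogram of an equally spaced record.
    Returns (frequency in cycles per sample, power) pairs for the Fourier
    frequencies k/n, k = 1 .. n//2, after removing the mean."""
    n = len(record)
    mean = sum(record) / n
    centered = [v - mean for v in record]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coef = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(centered)
        )
        spectrum.append((k / n, abs(coef) ** 2 / n))
    return spectrum


# Artificial "wave record": a single sinusoid with 4 cycles in 32 samples.
record = [math.sin(2.0 * math.pi * 4 * t / 32) for t in range(32)]
peak_frequency, peak_power = max(periodogram(record), key=lambda pair: pair[1])
print(peak_frequency, peak_power)
```

A real sea surface record would show power spread over a band of frequencies rather than a single spike, and the raw periodogram would need smoothing before it could serve as a spectrum estimate.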


Probabilistic Study of Clustering of Galaxies in a Static and in an Expanding Universe. JERZY NEYMAN 
AND ELIZABETH L. SCOTT, University of California (Berkeley). 

This paper reports on a study of the distribution of the galaxies conducted at the University of 
California. Expositions of the various stages of the study have been published from time to time, e.g., 
Astrophysical Journal, 1952 and 1953. 


A Factorial Design Applied to a Specific Chemical Process and Development Problem. H. GROHSKOPF 
AND F. WILCOXON, American Cyanamid Company. 

The task of evaluating a chemical pilot reactor capable of processing 120 pounds of raw materials 
per hour was solved by the use of a design for four factors at two levels. Two replications consisting of 
4 four-trial blocks were run. 

An organic liquid and a gaseous feed stream were reacted in the presence of a sulfuric acid catalyst. 
The four scalar factors were: concentration of gas feed, reaction time, reactor pressure, and reactor 
temperature. A fifth variable was introduced by operating with one gas feed nozzle arrangement for 
one replication, and a different arrangement for the second replication. 

Differences due to replications, and therefore due to gas feed arrangements, proved negligible. The 
most important main effect was due to raw material concentration and the important interaction was a 
concentration-reaction time interaction. 

The experiment furnished reliable data for further plant design work and gave an efficient guide to 
best operating conditions. 
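
How main effects and interactions are read off a four-factor, two-level design can be sketched as follows. The responses are generated from an invented model, not from the experiment reported here, and the sketch ignores the blocking and replication that the authors used; the function names are hypothetical.

```python
from itertools import product


def full_factorial(k):
    """All 2**k runs of a two-level design, coded -1 (low) and +1 (high)."""
    return [tuple(levels) for levels in product((-1, 1), repeat=k)]


def effect(design, responses, factors):
    """Estimated effect for the listed factor indices: a main effect for one
    index, an interaction for several. Computed as the signed contrast
    divided by half the number of runs."""
    total = 0.0
    for run, y in zip(design, responses):
        sign = 1
        for j in factors:
            sign *= run[j]
        total += sign * y
    return total / (len(design) / 2)


# Factor order (hypothetical coding): 0 concentration, 1 time, 2 pressure, 3 temperature.
design = full_factorial(4)
# Invented responses with a concentration effect, a smaller time effect,
# and a concentration-by-time interaction.
responses = [10 + 3 * r[0] + 1 * r[1] + 2 * r[0] * r[1] for r in design]
print(effect(design, responses, [0]))
print(effect(design, responses, [0, 1]))
print(effect(design, responses, [2]))
```

Each effect is the change in mean response between the high and low levels, so it is twice the corresponding coefficient in the invented model, and factors left out of the model show an effect of zero.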


Non-Parametric Tests: Power Under Normality. W. J. DIXON, University of Oregon and University of 
North Carolina. 

The power efficiency function E is defined as the ratio of sample size of the t-test to the sample 
size of a test under question which have identical powers for a fixed alternative, δ. This function is described 
for the sign test for paired samples of size 5, 10, 20 and for samples of five or less for the two 
sample tests: rank sum, maximum absolute deviation, median, total number of runs. The normal 
alternatives considered differ in mean value. 


General Review of Non-Parametric Methods with Special Emphasis on Randomization Tests. LINCOLN 
E. MOSES, Stanford University. 

Non-parametric tests appropriate to experiments involving matched pairs are explored in some 
detail; Fisher's randomization of the observations, the t-approximation to this, the normal approxima- 
tion, Wilcoxon's signed rank test and the sign test are all considered and illustrated. The character of 
inference in this framework is touched upon. The problem of which test to use is posed. Various two-sample 
and k-sample tests are illustrated in less detail; among these are: randomization of the observations, 
Wilcoxon-Mann-Whitney test, median tests, run test, analysis of variance by ranks (Friedman’s 
and Wallis-Kruskal’s). 
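
Two of the matched-pair procedures named above, Fisher's randomization of the observations and the sign test, can be sketched in a few lines. The function names and the differences used are made up for illustration; the randomization test enumerates every sign assignment, so the sketch is practical only for small samples.

```python
import math
from itertools import product


def randomization_test_paired(differences):
    """Two-sided randomization test for matched pairs: the proportion of the
    2**n sign assignments whose absolute sum is at least the observed one."""
    observed = abs(sum(differences))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(differences)):
        total += 1
        trial = abs(sum(s * d for s, d in zip(signs, differences)))
        if trial >= observed - 1e-12:  # small tolerance for float data
            extreme += 1
    return extreme / total


def sign_test_paired(differences):
    """Two-sided exact sign test; zero differences are dropped."""
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    positives = sum(1 for d in nonzero if d > 0)
    tail = min(positives, n - positives)
    p = 2 * sum(math.comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


# Hypothetical within-pair differences, all favoring one treatment.
diffs = [1, 2, 3, 4, 5]
print(randomization_test_paired(diffs))
print(sign_test_paired(diffs))
```

With five pairs all in the same direction both tests give the smallest attainable two-sided level, 2/32; they differ once the magnitudes of opposing differences come into play, since the sign test discards them.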


Components of a Difference Between Two Rates. EVELYN M. KITAGAWA, University of Chicago. 

This paper attempted a revised and systematic statement of the extent to which the difference 
between two rates can be accounted for by differences in the composition or structure of the two groups 
to which the rates refer. The difference between the over-all rates of two groups is separated into two 
major components, one due to differences in the composition and the second due to differences in specific 
rates of the two groups. The former major component is further subdivided into net subcomponents, 
each of which represents that part of the difference between the two over-all rates which is due to differ- 
ences in composition with respect to one factor independent of one or more other factors. For example, in 
a recent study of labor mobility, it was found that 65 per cent of the difference between crude mobility 
rates of Chicago and Los Angeles men was accounted for by differences in their age and migrant com- 
position; furthermore, the difference in migrant composition alone was responsible for virtually all of 
this 65 per cent reduction in mobility differentials. 
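
The separation of a difference between two over-all rates into a composition component and a specific-rate component can be sketched as below. The abstract does not give a formula, so the symmetric weighting used here (each difference weighted by the average of the two groups) is an assumption, as are the function name and the figures; the further split into net subcomponents by factor is not attempted.

```python
def decompose_rate_difference(comp_a, rates_a, comp_b, rates_b):
    """Split the difference between two crude rates into two parts.
    comp_*: share of each category in the group (shares sum to 1).
    rates_*: category-specific rates.
    Returns (crude difference, composition component, rate component)."""
    composition_component = 0.0
    rate_component = 0.0
    for cat in comp_a:
        mean_rate = 0.5 * (rates_a[cat] + rates_b[cat])
        mean_share = 0.5 * (comp_a[cat] + comp_b[cat])
        composition_component += mean_rate * (comp_a[cat] - comp_b[cat])
        rate_component += mean_share * (rates_a[cat] - rates_b[cat])
    crude_a = sum(comp_a[cat] * rates_a[cat] for cat in comp_a)
    crude_b = sum(comp_b[cat] * rates_b[cat] for cat in comp_b)
    return crude_a - crude_b, composition_component, rate_component


# Hypothetical two-category example (shares and mobility rates are invented).
comp_a = {"young": 0.6, "old": 0.4}
rates_a = {"young": 0.30, "old": 0.10}
comp_b = {"young": 0.4, "old": 0.6}
rates_b = {"young": 0.25, "old": 0.10}
print(decompose_rate_difference(comp_a, rates_a, comp_b, rates_b))
```

With this weighting the two components add up exactly to the crude difference, so the share attributable to composition can be quoted as a percentage in the manner of the 65 per cent figure above.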


The Application of Mobility Research to Labor Supply Models. DAVID ROBERTS, Carnegie Institute of 
Technology. 

One of the more important factors giving rise to the current interest in constructing labor supply 
models is their relationship to the input-output analysis. That technique yields the dollar outputs of 
each of 192 industries, implied by any given bill of final goods. To carry on from there it is necessary to 
translate dollar outputs into labor requirements and to balance the latter with labor supply in terms of 
area, industry and occupation. Knowledge of labor mobility, including labor force participation in that 
term, is essential in order to pass from the known distribution of the labor force to that which may be 
expected under the postulated conditions. After the population has been projected by age, sex, etc. 
groups, the labor force participation of these groups must be estimated. Less familiar problems arise 
with the attempt to project the distribution of the labor force by area, industry and occupation. 

The question of the stability over time of mobility patterns and rates is of obvious importance here. 
Many unresolved issues fall in this area. What occupational groups should be set up? Apart from the 
requirement of intra-class homogeneity and inter-class heterogeneity there is the question of potential 
job dilution and relaxation of efficiency standards which would alter class lines and mobility factors 
based upon them. There is also the question of what determines the direction which movement takes. 
Is it accidental factors such as proximity to plants, tips from friends, etc., or do many people have 
career patterns which they follow, etc.? These and other problems must be explored further before it will 
be possible to set up the type of labor supply models envisaged. In the meantime, models of the same 
type but necessarily less accurate and detailed can be constructed using data such as the 1950 census 
and the six-city mobility study which are now becoming available. 


On The Teaching of Statistics: Non-Parametric Methods in the Elementary Statistics Course. RALPH 
BRADLEY, Virginia Agricultural Experiment Station of the Virginia Polytechnic Institute. 


Radical changes in the teaching of elementary statistics are not recommended. Suggested reasons 
for the inclusion of work on non-parametric methods are (i) to provide easier or clearer means of illustrating 
basic principles, (ii) to add interest in those places where non-parametric methods are applica- 


discussed for each type of elementary course. An extensive bibliography is included with the paper. It is 
a fairly complete list of references on the teaching of statistics and related topics. 


Probabilistic Theory of Neural and Social Phenomena. ANATOL RAPOPORT, University of Chicago. 


The problem of excitation spread is formally equivalent to a problem treating the spread of a state 
(such as information) through a population. The computed time course of such a spread is compared 
with experimental data on message diffusions through school children populations. It is shown how the 
departures of the observed values from the predicted can be accounted for by imposing a “structure” on 
the net of contacts initially supposed to be random. This leads to a theory of population structure in 
terms of existing contacts and suggests modifications of the original theory. 

The implications of these structural considerations are discussed with reference to possible neural 
mechanisms responsible for the organization of behavior. 


Research Design of the Survey of Patterns and Factors in Mobility in Six Cities. MARGARET S. GORDON, 
University of California (Berkeley). 

Despite recent heightened interest in mobility research, many important questions in this field remain 
unanswered. The 1951 Six-City Occupational Mobility Survey (New Haven, Philadelphia, Chicago, 
St. Paul, San Francisco, Los Angeles) was an important step in providing more comprehensive 
data for analysis of labor mobility patterns and factors. The study attempted to answer the question: 
“Are there occupational, industrial, and regional differentials in mobility, of sufficient importance to 
affect manpower planning in a period of industrial mobilization?” While regional differences in mobility 
could not be directly analyzed, factors found to be responsible for inter-city differences may be regarded 
as clues to the probable nature of regional differences. 

Design of the study was characterized by four main features: (1) an enumerative-type interview 
with workers as the respondents; (2) a random sample of the entire labor force (excluding persons under 
25 years of age because of limited labor force experience); (3) analysis of civilian job changes on the 
basis of work histories during the period, 1940-1949; and (4) use of the Census Bureau's occupational 
and industrial code. 

Findings are preliminary, since the analysis is still incomplete. Like other studies, this one showed 
that mobility rates vary inversely with age and that a minority of workers account for most job shifts. 
Mobility rates also tended to vary inversely with the position of workers in the occupational ladder but 
did not vary significantly among broad industry groups, except for the construction industry where 
workers changed jobs relatively frequently. Job shifts were likely to involve a change in occupation, in
industry, or in both simultaneously, but workers in certain occupation groups (professional workers,
female clerical workers, and skilled craftsmen) were relatively unlikely to shift to other groups. While 
factors and patterns in mobility were strikingly similar in all six cities, average mobility rates varied 
considerably from city to city, with workers in Los Angeles and San Francisco displaying the greatest 
mobility. Differences in rates of in-migration were primarily responsible for these inter-city variations.


Factors in Generation Occupation Mobility. ALBERT J. REISS, JR., Vanderbilt University.


This paper presents a statement and evaluation of a technique for the measurement of occupation
mobility and applies it in the analysis of factors in generation occupation mobility in six American cities.
Occupation mobility is an index of the ease or difficulty with which individuals or groups acquire po-
sitions or jobs open to competition in the labor force. Labor Force Demand Mobility is occupation move-
ment due to changing demands of the occupation structure. If the size of the occupation groups changes
over time, it follows that some intergenerational mobility occurs as men are recruited from declining
occupations into expanding ones. Social Distance Mobility is occupation movement due solely to differ-
ential evaluation of personal and social characteristics of workers. We need a measurement technique
to distinguish between the two. The measurement technique is based upon the work of Goldhamer where
social distance mobility is defined as the ratio between actual mobility and the amount of mobility we
would expect if there is no relation between the sons’ occupational destination and their occupational
origin. The denominator of Social Distance Mobility is the expected value in conventional contingency
analysis. Expected mobility values therefore represent the amount of movement that occurs if only
availability factors influence occupation movement. One unit is that amount of mobility expected were
there no relation between fathers’ and sons’ occupational position.
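
The ratio described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `social_distance_ratios` and the two-by-two table of counts are hypothetical and do not come from the paper. Each cell's expected count is the usual contingency-table value (row total times column total, divided by the grand total), and the ratio of observed to expected equals one unit when fathers' and sons' positions are unrelated.

```python
def social_distance_ratios(table):
    """Return observed/expected ratios for a fathers-by-sons occupation table.

    table[i][j] is the number of sons whose fathers were in group i and who
    are themselves in group j. The expected count is the conventional
    contingency-table value under no relation between origin and destination.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    ratios = []
    for i, row in enumerate(table):
        ratio_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            # A ratio is undefined when the expected count is zero.
            ratio_row.append(observed / expected if expected else float("nan"))
        ratios.append(ratio_row)
    return ratios


# Hypothetical counts: rows are fathers' groups, columns are sons' groups.
hypothetical_table = [[30, 10], [10, 50]]
ratios = social_distance_ratios(hypothetical_table)
```

In the hypothetical table the diagonal ratios exceed one unit (sons remain in their fathers' group more often than availability alone would produce) and the off-diagonal ratios fall below it.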

These ratios permit only a relative comparison of the amount of mobility between occupations,
since their actual size is a function of the proportionate representation of that occupational class in the 
labor force, or, of the labor force demand factor. Attempts to construct an index based on these ratios 
which permits comparison between occupations have not been successful. This failure may be ascribed
to a need to take into account change in the labor force demand factor between generations, as well as
the size of the demand factor. 

The substantive question which the study analyzes is, to what extent do migration and educational 
attainment influence generation occupation mobility. The following conclusions were reached. (1) While 
migration, as compared with stability of residence, provides greater opportunity for persons to move 
within the broad occupational orientations of their fathers, it decreases their movement out of these 
broad occupational orientations. (2) Higher education provides roughly equal access to the non-manual 
occupations for men in all occupation groups while the absence of such education severely limits access. 


The Randomization Theory of Experimental Inference. OSCAR KEMPTHORNE, Iowa State College.

The paper opens with a discussion of the need for a precise description of the role of randomization 
in the theory of experimental inference. In certain cases, for example, normal law theory is called upon
for the making of probability statements, while in other cases, the model which is used is definitely a
finite one arising from randomization considerations. The paper is concerned with a restricted class of
experiments, namely those in which various treatments are being compared. Some discussion is given 
of the criteria by which a theory of inference may be evaluated. After dealing with these preliminary 
questions, the paper is concerned with the basic patterns of comparative experimental designs, as 
follows: (1) the completely randomized design, (2) randomized complete blocks, (3) incomplete block 
designs, (4) Latin square designs and (5) designs in which treatments are applied in sequence to the 
experimental units. 

The essential assumption is that of the existence of a device which produces random numbers.
The mathematical treatment is presented by the use of random variables which specify the distribution 
of the treatments on the experimental units. This method of description makes clear the nature of the 
inferences that are being made, and reduces the problems to the consideration of the distribution of 
(usually) simple functions of the random variables. The concept of additivity is defined and the role of
additivity in experimental inference discussed. 
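
As a rough illustration of a finite model arising from randomization, the sketch below enumerates every possible assignment of one treatment to units in a completely randomized design and asks how often the difference in means is at least as large as the one observed. The function name, the responses, and the choice of the mean difference as the statistic are assumptions made for this example; they are not taken from the paper. The calculation presumes additivity with no treatment effect, so that each unit's response would be the same under any assignment.

```python
from itertools import combinations


def randomization_p_value(responses, treated_indices):
    """Two-sided randomization p-value for a completely randomized design.

    Every choice of len(treated_indices) units out of all units is treated
    as equally likely, which is the only probability assumption used.
    """
    n = len(responses)
    k = len(treated_indices)
    total = sum(responses)

    def mean_difference(indices):
        treated_sum = sum(responses[i] for i in indices)
        return treated_sum / k - (total - treated_sum) / (n - k)

    observed = abs(mean_difference(treated_indices))
    assignments = list(combinations(range(n), k))
    # Small tolerance guards against floating-point rounding in the comparison.
    as_extreme = sum(
        1 for a in assignments if abs(mean_difference(a)) >= observed - 1e-12
    )
    return as_extreme / len(assignments)


# Hypothetical responses; units 3, 4 and 5 received the treatment.
p = randomization_p_value([1, 2, 3, 10, 11, 12], (3, 4, 5))
```

With six units and three treated there are 20 equally likely assignments, so no p-value smaller than 1/20 (or 2/20 for a two-sided statement) can be obtained from this design.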

Finally a comparison of the merits of inferences based on randomization theory with the merits of 
normal law inferences is made. Also some points are noted about randomization inferences based on 
techniques other than the usual analysis of variance of observed values. 


Precision Measurements in Thermometry. H. F. STIMSON, National Bureau of Standards.

The accuracy of temperature measurements on the International Temperature Scale, using plati-
num resistance thermometers, depends on the accuracy of realizing the fixed points and interpolating
between them. It also depends upon the reproducibility, sensitivity, internal consistency, etc. of the


Statistical Problems Encountered in the Programs of the Small Defense Plants Administration.
RODERICK H. RILEY, Small Defense Plants Administration.


business, but progress is being made toward one, in which statistically significant variations among 
industries are recognized, to replace the present, uniform test of 500 employees. Adoption by SDPA of 


programs, as authorized by statute, would also hasten
the achievement of greater uniformity in program statistics, which is sorely needed.


cal agencies to maximize usefulness of standard business series for
small business analysis, such as through retabulation of Census establishment data by size of company 
and by product as well as by indi 


Organizational, Personnel, and Statistical Problems Facing the Neophyte Station Statistician. J. C.
DARROCH, State College of Washington.


others, differential sex response to treatment needs to be recognized as a real source of error. These are
only a few of the problems mentioned. The review of manuscripts has been found valuable in the
initial stages as a means of becoming familiar with the research problems in progress, the research
personnel involved and as a basis for educational efforts. The degree of utilization of the statistical con- 
sultant is largely dependent upon the general level of statistical knowledge of the research group; thus 
an educational program is imperative before one can expect to be presented with really challenging prob- 
lems. 


Organization and Scope of Activities of Station Statistician. CARL E. MARSHALL, Oklahoma Agricultural
and Mechanical College. 

This paper gives a picture of the experiment station's statistician in the Land Grant Colleges of the
United States based on letters of inquiry to the directors of our forty-eight Agricultural Experiment
Stations. A summary of the forty-four replies follows:

Station statisticians are staff members of about one-third of our Land Grant Colleges. If there is 
more than one at a station, they are usually associated with some administrative unit such as a statistical 
laboratory or a department of statistics; otherwise they are attached to the director's staff or are mem- 
bers of some subject matter department such as agricultural economics, etc. If no statistician is desig- 
nated as station statistician, the consultant usually is a member of some department. In that case, he 
often cuts across departmental lines in rendering service to the station. 

The scope of activities of the statistician is very broad. Through his efforts to keep abreast of the
many fields of research, he may act as a coordinator of research among the many departments. He is
connected with the teaching program of the college, usually in the field of statistics. His services are 
extended to the in-service statistical training of the research staff. Computing services are available at 
most stations under the supervision of the statistician. There is considerable variation in the training 
deemed essential in fields outside of statistics. 


Research on Extent and Scope of Collective Bargaining. KIRK R. PETSHEK, Washington, D. C.

The paper deals first with the inadequate knowledge of extent of coverage of collective agreements,
the difficulty of picking a representative sample from what is essentially an unknown universe, and the
shortcomings of such quantitative analyses of collective agreement provisions which are being made on 
this basis. 

Then methods of studying bargaining patterns are proposed which would throw light on the uni- 
formity or variation between agreements and the way different clauses develop and are passed from one 
agreement to another. Bargaining is explained as a year-round activity, of which informal accommoda- 
tions of daily problems are an important part, as are grievance and arbitration cases. In fact, they pro-
vide the formal agreement with real content. Research into informal arrangements, into unofficial re-
ports of mediators (state or federal) and into the substance of arbitration decisions illuminates uniquely
the process and contents of collective bargaining. Most of this research needs to be done. Finally, the sub-
stance of bargaining scope, i.e. the limits of management prerogative, has been subjected to too much
detailed research, rather than recognition as a matter of bargaining, hence pragmatically determined in 
each case. 


Comparison of the Means of Two Samples. DAVID L. WALLACE, Princeton University.

Two samples of measurements of the results of a process are given. In each of the samples a different
treatment was used. Procedures for comparing the effects of the two treatments are discussed. Interest is 
restricted to summarization of the results of the experiment by a confidence interval. For paired samples, 
procedures based on Student's t-statistic, the sign test count, Wilcoxon's signed-rank sum, Lord’s short- 
cut version of t using range, and Walsh’s range-midrange test are considered. For unpaired samples, the 
two-sample Student's t-procedure, its modification to allow for unequal variability within the two 
samples, and Wilcoxon’s two-sample count procedure are considered. Practical methods for constructing
confidence intervals are shown for each procedure. The different procedures are compared according to 
the criteria of distribution restrictions necessary for validity, power, and ease of application. 
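
One of the paired-sample procedures named above, the sign test count, yields a confidence interval directly from the ordered differences. The sketch below is a minimal illustration, not the paper's own construction: the function name and the sample differences are hypothetical, and the only distributional assumption is that each paired difference is equally likely to fall above or below the median difference. The interval runs from the k-th smallest to the k-th largest difference, with k chosen from the binomial distribution with probability one-half.

```python
from math import comb


def sign_test_interval(differences, alpha=0.05):
    """Confidence interval for the median paired difference via sign counts.

    Returns (lower, upper, coverage), where coverage is the exact confidence
    level, which is at least 1 - alpha.
    """
    d = sorted(differences)
    n = len(d)
    k = 0
    tail = 0.0  # P(B <= k - 1) for B ~ Binomial(n, 1/2)
    while k < n // 2:
        next_tail = tail + comb(n, k) / 2 ** n
        if next_tail > alpha / 2:
            break
        tail = next_tail
        k += 1
    if k == 0:
        raise ValueError("sample too small for the requested confidence level")
    # k-th smallest and k-th largest ordered differences (k counted from 1).
    return d[k - 1], d[n - k], 1 - 2 * tail


# Hypothetical paired differences (treatment 1 minus treatment 2).
lower, upper, coverage = sign_test_interval([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```

Because the count is discrete, the achieved confidence level generally exceeds the nominal one; here ten differences give an interval from the second smallest to the second largest value.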


Production of Vital Statistics as a Combined Federal-State Operation. O. K. SAGEN, Illinois Depart-
ment of Public Health.

Up to the present the national vital statistics of the United States have been compiled from individual
transcripts of birth and death certificates furnished to the Federal Government by the states. This results
in a considerable degree of duplication, since the same data are coded, key-punched, and tabulated at
both levels of government. To eliminate part of this a procedure has been developed whereby the states
may furnish duplicates of their punched cards to the National Office of Vital Statistics for national
tabulations. The procedure was experimentally tested with the State of Illinois on births in 1950 and in
1951. Duplicates of the Illinois punched cards will be used in the production of national birth statistics
for 1951 and after. Two additional states will submit punched cards for 1953 and others are planning 
this for 1954. One state is experimenting on the feasibility of furnishing pretabulated data in the form 
required for national statistics. 

This combined operation requires adherence to uniform definitions and interpretations, as well as 
consistency and accuracy in processing by the participating states and the National Office of Vital 
Statistics. The basis for such cooperation has been laid in the development of close working relationships
over a period of years. Death data present particular difficulties in statistical processing because of the
complications in cause-of-death coding. In time it is expected that national death statistics can also be 
produced by a combined operation such as has been started for births. 


Research on Response Errors. ELI S. MARKS, Bureau of the Census.


Census Bureau research takes as basic the distinction between “response variance” and “response
bias.” While measurement of bias is much more difficult than measurement of variance, results obtained
point to the need for bias studies. Large response variance may be associated with either large or small net
error (bias) in the Census. In the Post-Enumeration Survey of the 1950 Censuses (a reinterview study 
designed to check the accuracy of the Censuses), over one-third of the persons reported in certain cate- 
gories in the Census (e.g., 1949 individual incomes of $2500-2999, occupied dilapidated dwelling units) 
were reported as not in the category on reinterview. However, net differences between Census and re- 
interview results are, in general, small—less than 10 per cent in most cases. It is quite possible for a 
large proportion of the individual reports to show substantial reporting errors without any significant 
effect upon the distribution of the entire population or upon the conclusions that might be drawn from 
the data. On the other hand, a consistent error in reporting, even though it affects only a small propor-
tion of the individual reports, may result in substantial distortion of the distribution of the entire popula-
tion and of conclusions based upon this distribution.
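
The contrast between many individual reporting errors and a small net error can be made concrete with a small calculation. The sketch below uses hypothetical data and hypothetical names (`gross_and_net_difference`, `census`, `reinterview`); none of the figures come from the Post-Enumeration Survey. It counts persons placed in a category by one interview but not the other (the gross difference) and the difference between the two category totals (the net difference).

```python
def gross_and_net_difference(census, reinterview):
    """Compare two 0/1 classifications of the same persons.

    Returns (gross_rate, net_rate): the share of persons classified
    differently, and the difference in category totals as a share of persons.
    """
    n = len(census)
    census_only = sum(1 for c, r in zip(census, reinterview) if c and not r)
    reinterview_only = sum(1 for c, r in zip(census, reinterview) if r and not c)
    gross_rate = (census_only + reinterview_only) / n
    net_rate = (census_only - reinterview_only) / n
    return gross_rate, net_rate


# Hypothetical data for 100 persons: 30 are in the category in the census.
census = [1] * 30 + [0] * 70
# On reinterview, 10 of those 30 are reported out and 9 others are reported in.
reinterview = [1] * 20 + [0] * 10 + [1] * 9 + [0] * 61
gross_rate, net_rate = gross_and_net_difference(census, reinterview)
```

In this made-up case one-third of the persons originally in the category are reported differently on reinterview, yet the category total changes by only one person in a hundred, because errors in the two directions nearly cancel.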


The Distribution of Government Burdens and Benefits. RUFUS TUCKER, General Motors Corporation.


No sound judgment can be formed concerning either the equity or the economic burden of a tax
system unless the distribution of the benefits of government activity financed by the tax system is also
studied. This paper is limited to distribution by income classes.

The redistribution of income was accelerated between 1920 and 1948 and for this the progressive
nature of our tax system was partly responsible. But the increasing tendency to spend government
money for the benefit of the lower and middle income classes was more responsible.

The burden of all taxes, direct and indirect, rose from under 12% of income in 1929 to over 27% in 
1948. The burden on the poorest tenth rose from under 9% to nearly 17%; the burden on the wealthiest 
one-hundredth rose from 18% to 51%. 

Although the average income per spending unit, measured in constant dollars and after deducting
income taxes, rose 36% from 1929 to 1948, the average income of the top one-fifth of spending units, 
measured in the same way, fell 10%. 

Some government expenditures are plainly for the direct and sometimes exclusive benefit of certain
classes and can be allocated to income classes with a fair degree of confidence. Other expenditures are
for the general welfare, and can be allocated with equal logic on the basis of consumption or income or
property, or per capita. We find that the poorest half of the population has been receiving more benefits
from government than it has paid for, while the wealthiest one-tenth has been paying for much more
than it received.

In 1948 the ten per cent of consumer units that received the lowest incomes received 2.3% of all
income, paid 1.4% of all taxes, and received between 3.9% and 7.0% of all government benefits. The
wealthiest ten per cent received 31.3% of all income, paid 45.3% of all taxes, and received between
14.2% and 31.4% of all government benefits. The figures for preceding years show similar relationships,
with greater inequality of income and less progression in taxes. 
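
The 1948 shares quoted above can be turned into two simple ratios for each group: the share of taxes relative to the share of income, and the share of benefits relative to the share of taxes. The sketch below only rearranges the figures given in the abstract; the function and variable names are hypothetical, and the ratios themselves are an illustration rather than a calculation reported by the author.

```python
# Percentage shares for 1948 as stated in the abstract.
shares_1948 = {
    "lowest tenth": {"income": 2.3, "taxes": 1.4, "benefits": (3.9, 7.0)},
    "highest tenth": {"income": 31.3, "taxes": 45.3, "benefits": (14.2, 31.4)},
}


def burden_ratios(group):
    """Return (taxes/income, (low benefits/taxes, high benefits/taxes))."""
    tax_to_income = group["taxes"] / group["income"]
    low, high = group["benefits"]
    return tax_to_income, (low / group["taxes"], high / group["taxes"])


lowest = burden_ratios(shares_1948["lowest tenth"])
highest = burden_ratios(shares_1948["highest tenth"])
```

A tax-to-income ratio below one means a group's share of taxes is smaller than its share of income; a benefits-to-taxes ratio above one means its share of benefits exceeds its share of taxes under either allocation of the general-welfare expenditures.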

Our system of taxation is progressive against income, and even more progressive against benefits 
received. It has already reduced the disposable income of the wealthiest twenty per cent of our people 
(those belonging to consumer units with incomes over $5000 in 1948). It is time to consider seriously
whether higher taxes on the wealthy, or higher taxes in general, may not reduce the total national in-
come, or at least prevent its growth, with other undesirable economic, political, and moral conse-
quences.


General Equilibrium Aspects of Incidence Theory. RICHARD A. MUSGRAVE, University of Michigan.

In formulating incidence theory, it is necessary to define what is meant by incidence and effects of 
taxation. I propose that the former be defined as the change in the distribution of real income by size
income brackets, the latter as the change in the level of real output which results from the economy’s
adjustment to a change in budget policy. A further important distinction is between absolute incidence
where a tax situation is compared with a no-tax situation, and differential incidence where the respective 
results of two taxes providing equal yield are compared. For various reasons the latter approach is 
preferred. 

The problem of general equilibrium in incidence analysis may be demonstrated with regard to the 
incidence of excise taxes. The conclusions are: (1) It is only of minor importance whether the initial 
adjustment to the tax takes the form of increasing the price of the taxed commodity while holding factor 
payments constant, or of decreasing factor payments while holding the price of the taxed commodity 
constant. What matters is the resulting change in the relative market prices of consumer and of capital
goods. (2) The result of an excise on consumer goods only will be to raise the price of consumer goods 
relative to the price of capital goods, whatever be the change in the absolute level of prices. Such a tax, 
therefore, will fall on the consumers. Since consumption expenditures decline as a percentage of typical 
family budgets when moving up the income scale, the incidence is regressive. (3) The result of an excise 
on capital goods only will tend to leave relative prices of capital and consumer goods unchanged. Such a 
tax, therefore, will fall on both consumers and savers (buyers of capital goods) and tend to be distributed 
proportionately. However, a general retail sales tax is largely a tax on consumer goods and it is justified, 
therefore, to impute such a tax to the purchasers of the consumer goods. 

The above argument involves certain simplifying assumptions. In particular, we have disregarded 
possible resulting changes in the distribution of money earnings, and possible resulting changes in total 
output. However, there are reasons to expect that such changes will be distributionally neutral and that 
they may be disregarded, at least in a first approximation to the problem. 


Exposition of Straight Line Fitting Methods. RICHARD F. LINK, Princeton University.

Suppose $\eta = A + B\xi$ describes the relationship between two variables $(\xi, \eta)$. The problem of estimat-
ing and setting confidence limits on A and B given paired estimates (x, y) of $(\xi, \eta)$ is discussed.

The classical case of x measured without error and $y = \eta + w$, $E(w) = 0$, $\operatorname{var}(w) = \sigma^2$ is discussed. The
classical least squares and the Nair and Shrivastava procedures (Sankhyā, 6, 1942) for obtaining
estimates of A and B are illustrated. The relative precision of the estimate of B for the two procedures
is indicated. Confidence limits are found for A and B by classical methods assuming y to be distributed
according to $N(\eta, \sigma^2)$. Confidence limits for A and B are also found using short-cut techniques involving
the use of the range.

The case of x measured with error is discussed. The additional assumptions and information neces-
sary for handling this type of data are indicated. In particular the use of an instrumental variate is dis-
cussed.

The methods for estimating A and B under these assumptions proposed by J. W. Tukey (Biometrics, 
7, #1, 1951) are illustrated. The procedure for obtaining confidence limits for B with these methods is 
also illustrated. The Nair and Shrivastava procedure is also illustrated for these assumptions. 
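
For the classical case, the two kinds of estimate can be compared on a small example. The sketch below is an illustration under loud assumptions: `least_squares` is the ordinary least squares fit, while `group_averages` assumes that the Nair and Shrivastava procedure is a method of group averages in which the points are ordered by x, split into thirds, and the line is drawn through the mean points of the lowest and highest thirds. That description, the function names, and the data are assumptions for illustration and are not taken from the abstract.

```python
def least_squares(xs, ys):
    """Ordinary least squares estimates (a, b) for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def group_averages(xs, ys):
    """Line through the mean points of the lowest and highest thirds by x.

    Assumed form of a group-averages procedure; see the lead-in above.
    """
    pairs = sorted(zip(xs, ys))
    k = len(pairs) // 3
    if k == 0:
        raise ValueError("need at least three points")
    low, high = pairs[:k], pairs[-k:]
    x_low = sum(p[0] for p in low) / k
    y_low = sum(p[1] for p in low) / k
    x_high = sum(p[0] for p in high) / k
    y_high = sum(p[1] for p in high) / k
    b = (y_high - y_low) / (x_high - x_low)
    return y_low - b * x_low, b


# Hypothetical error-free data on the line y = 2 + 3x.
xs = [0, 1, 2, 3, 4, 5]
ys = [2 + 3 * x for x in xs]
a_ls, b_ls = least_squares(xs, ys)
a_ga, b_ga = group_averages(xs, ys)
```

On error-free data both procedures recover the same line; with errors in y the group-averages slope uses less of the information in the sample, which is the kind of loss of relative precision the abstract refers to.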


Broadening the Significance of Vital Statistics Through Special Studies. PAUL M. DENSEN, University
of Pittsburgh.

In the past, routine vital statistics of births, deaths and infectious diseases sufficed to answer many
of our statistical needs. These vital statistics often pointed up the need for more detailed investigations of
the problems and much of value was learned through related studies. Full significance of vital statistics
has always required the evaluation of special studies.

Routine vital statistics continue to point out where our problems lie, but they no longer provide 
sensitive indices to the magnitude of the problem and to the effectiveness of efforts to improve the health 
of the population. Neither our mortality statistics nor our notifiable disease statistics give any indication 
of the magnitude and distribution of diseases of long duration or those with low mortality rates and high 
disability rates. Such information as we do have comes from special studies of one kind or another,
particularly morbidity surveys. 

Vital statistics have made it clear that the character of our public health problems is changing: 


There are several reasons why such evaluation is essential to continued progress in public health:

(1) The influence of public health programs on the health of the population is far more subtle than 
it used to be and more sensitive measuring instruments are needed to measure this influence. It is these 
relatively more subtle changes that must be measured if we are to continue to interest the public in sup-
porting public health programs. We can only do this when we are in a position to produce direct evidence
of the effect on health of specific procedures. Such evidence can only come from carefully controlled 
special studies. 

(2) The unit of operation of public health today is becoming increasingly the individual rather than
the population as a whole. It can no longer be said that if a local health department is available there
will necessarily be less diphtheria, less typhoid, fewer maternal deaths, because these are very minor 
problems today. On what basis then shall expenditures for health work be justified? The objectives of 
present public health programs must be examined and studies designed to evaluate achievements in rela- 
tion to these objectives if such justifications are to be found. 

The several functions of public health statistics are best served by a combination of routine vital
statistics and special studies. If the necessity for such a combination is to be recognized and provided
for administratively, administrators and program directors must come to realize the basic quantitative
nature of the problems facing them in the development of public health policy. A demonstration is
needed of the way in which the statistical approach can help the administrator in the formulation of
policy.


Improving Marriage and Divorce Statistics. HUGH CARTER, Public Health Service, Federal Security
Agency.

This paper is published in full elsewhere in this issue.


Pension Plans—The Concept of Actuarial Soundness. DORRANCE C. BRONSON.

As a minimum for actuarial soundness, the employer should currently fund the pension credits 
applicable to the years elapsing after the plan’s inception and should, by retirement age, have funded 
the past service credits for the then retiring employees. These definitions will not satisfy all parties; at


origin, is to make money for the employer or, in some instances, to earn increased pensions for employees.
This purpose is illustrated by an investment policy for the fund aimed at substantially higher yields, or


A retired public employee cannot feel too secure where little or no fund stands back of his pension. The
federal Civil Service Retirement Act, while having sufficient funds, perhaps, for the existing pension
roll and for accumulated employee contributions, does not have any assets—but only the taxing power


The dangers of inflation are discussed in making meaningless an achievement of actuarial soundness
and in making difficult, if not impossible, continued rounds of higher benefits under the plan for the
... roll at any time. Perhaps only Social Security can raise benefits “across the board”
.... The paper questions whether the country can really stand actuarial soundness in
... large accumulation of investments which this would mean. If all employers with 50 or more em-
ployees were to set up actuarially sound plans, it might entail reserve assets ...
national debt. Would our capital plant and its productivity keep pace with the savings necessary to
represent this degree of actuarial soundness? 

Finally the paper discusses the consequences of lack of actuarial soundness—citing the Railroad 
situation prior to the Railroad Retirement Act—calling attention to the coal fund and raising the ques- 
tion as to what extent, if funds are not sound, an ultimate bailing out by nationalization of the pension 
plans might ensue. Fortunately most industrial pension plans are proceeding on a basis which would not 
bring them to these straits, but this, in the main, has been during a period of ready money and high 
tax rates so that whether these indicated good intentions of making contributions on an actuarial basis 
continue, in a different economic milieu, is one of the questions for the future. 


Labor's View on Actuarial Requirements for Pension Plans. SOLOMON BARKIN, Textile Workers Union
of America, CIO.

Trade-unions, early in the days of the current pension movement, directed their attention to 
establishing benefits for the superannuated and those about to be retired. The first plans centered around 
providing retirement benefits of $100 for employees with 25 years of service with a company, with re-
duced amounts for employees with lesser service. The unions relied upon the financial solvency of the
particular company in the years ahead to meet the obligation. 

As collective bargaining developed, there was an increasing emphasis upon fixed cost plans with
defined benefits. Both management and unions favored these programs since they provided a more
determinable basis for negotiations. Four types of fixed cost plans were evolved: (1) a fixed hourly rate; 
(2) a defined obligation to meet the cost of current service and amortization of past service over a defined 
period; (3) fixed percentages of payroll; and (4) fixed charges on units of output. Unions have also 
identified the contribution as a form of wage payment. With the fixing of the contribution, they insisted 
upon the separation of the sums into trusteed funds. Unions have disapproved of profit-sharing systems 
as methods of financing these funds, as they do not provide fixed rates of contribution. 

The segregation of the pension funds and the fixing of the employer's rate of contribution have 
opened up opportunities for determining the best utilization of these funds. Unions are promoting the
establishment of worker rights to the benefits, even if he does not remain with the company the full 
service period required for the full benefits. In newer industries, the early vesting privileges are being 
combined with provisions for separation pay to enable employees to arrange their transfer to new jobs 
with greater ease.

Other developments may be noted: 

1. The employer's rate of contribution has been increased as the pension funds have increased their
benefits.

2. The increase in federal Social Security benefits has led to improvements in the benefits received
since many plans provide for the worker to share all or part of these improvements without re- 
duction of their benefits under the private pension. 

3. Unions have insisted that the actuarial gains resulting from the use of low interest and turnover 
rates and assumption of early retirement be kept in the fund. 

4. Unions are increasingly favoring self-administered and self-insured plans.

5. Cost-of-living adjustments have been discussed in some negotiations. 

6. More adequate provision for the disabled is receiving attention. 

7. Benefits have been made more liberal for employees of longer service and the higher wage
brackets. 

Funds built on conservative principles have shown impressive actuarial gains. The increased com- 
prehension of this entire mechanism by trade-union leaders has also provided a base for adapting the 
benefits to the peculiarities of different industries and groups of workers, as well as improving them so 
that actuarial re-evaluations are necessary. 

Unions are aware of the need of keeping “sound” pension funds. But they are armed with the ex-
perience that the rate of contribution is not necessarily fixed. In an expanding economy actuarial gains 
and higher contributions provide the base for improved benefits. Pensions should be adequate to enable 
those unable to work, to retire voluntarily with reasonable financial security. These programs must not 
be the vehicles for forcing the retirement of older workers. Self-administration by joint committees of 
management and unions affords the greatest opportunity for promoting realistic collective bargaining 
and effecting changes which will better realize the purposes of the plan. 


Some Recent Developments in Canadian Statistics. HERBERT MARSHALL, Ottawa, Canada. 

The paper describes some of the new methods used in taking the 1951 Population and Housing 
Census of Canada. These included a much greater mechanization of the Census operations. A “mark-
sense” document was used in the field and these, when completed, were run through a new document
punch machine to produce a punched Hollerith card. This and other new methods are resulting in a
reduction in the time normally required for completing a census by one-half and in large monetary
savings. 

Developments in the field of Industrial Statistics include a revised index of industrial production in 
which the concept of “net output” is used as a current indicator for numerous industries instead of gross 
value of production. Other developments consist of sample surveys furnishing information adequate to 
permit projecting annual statistics on a monthly basis for sales and inventories, thus furnishing current 
indicators. 

In Health statistics the main developments have been the undertaking of a sample survey of sick- 
ness in the general population and the organization of a much more adequate system of hospital sta- 
tistics. 

In Agriculture experiments are being made for obtaining current statistics in certain sectors of the 
field with probability sampling based on new information secured in the 1951 Census of Agriculture. 

The new Consumer Price Index put out by Canada in October 1952 was the result of three years of
preparatory work. The methods used include numerous departures from those used in the Cost-of-Living
Index which it replaces. Two important developments were the introduction of a method to adjust
for seasonal changes in the consumption pattern for certain foods and the inclusion of a measurement 
of home ownership costs. 


Some Recent Applications of Statistics in Australia. Maurice H. Belz, University of Melbourne.


At the National University in Canberra, a Department of Mathematical Statistics has been 
created, with research and advisory functions, while in the Universities of Melbourne and Adelaide 


versities by the several statisticians attached to the various Departments of Mathematics. Some outside


In the research field, engineering and agricultural experiments have introduced modern statistical 
techniques, such as factorial, split-plot, partial replication and multiple regression procedures. These
have been employed in the gasification of coal, production of hard carbon from brown coal, briquetting 
of brown coal, traction research on various agricultural machinery, analysis of rainfall data, prediction 
of rainfall, secular variation of rainfall, ecological investigations, biological assays, etc. A considerable 
interest is also being displayed in medical statistics, both in Sydney and Melbourne. The Common- 
wealth Scientific and Industrial Research Organization continues to expand its statistical activities in 
almost all of its Divisions and Sections from Plant Industry to Tribophysics and from Forestry to Ani- 


Research in Mathematical Statistics is being pursued in the various centers, often inspired by practical
problems presented by the experimenter, for example, tasting experiments, missing values in certain
complicated designs, separation of chemical solutes, and distribution of chain molecules.


work in this field were developed for Japanese industry as a 
Japan and was extremely influential in the quality control
work in Japan. (2) The Japanese Standards Association organized a statistical quality control com- 
mittee, in order to develop some standards in quality control methods, 

In the fall of 1951, we had within Mitsubishi organi 
and eleven part time workers for statistical 


Hin quality control was estimated at roughly 0.575 of our current sales volume of $28,000,000 in 

Many firms in control programs more or less similar to ours.
success and failure. Probably one of the fundamental
is lack of tight but flexible tie between statistics and
Statistical Organization and Estimates of Crops in West Bengal, India. M. CHAKRAVARTI, State Statistical
Bureau, Calcutta.

Statistical work of the West Bengal Government is centralized in the State Statistical Bureau, which 
serves as coordinator of data processing, standardizer of reporting forms and publisher of all routine 
statistics. All statistical surveys are designed and executed by the Bureau. Examples of topics covered 
are: annual industrial census, acreage and yield estimates for crops, price indices, cost of living indices, 
irrigation and hydroelectric benefit assessment rates, and family budgets. Special ad hoc surveys are 
also made; examples are: living characteristics of middle and lower class families in order to set minimum 
wages for government employees; refugee population counts; relationship between rental values and tax 
assessment values; morbidity, birth and death data. 

Crop sample surveys are used to implement the government food control and rationing program.
Since movement of crops from one district to another is prohibited, accurate data on district production
is required. Sample surveys involve sampling units of about 2 acres selected systematically at random at
intervals of half a square mile of cultivated areas. The sample is divided into two subsamples on the
basis of odd and even numbered sampling units. Investigator variance and bias is estimated by replica- 
tion for fifteen per cent of the sample units. Special precautions are taken to revise estimates in the light 
of subsequent crop disasters, caused by flood, pestilence or weather. 
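
The odd/even split of a systematic sample described above can be sketched in a few lines. This is a minimal illustration only: the simulated plot yields, the sampling interval of 10 and the function names are hypothetical and are not taken from the Bureau's surveys.

```python
import random
import statistics

def systematic_sample(population, interval, rng):
    """Pick a random start in [0, interval) and then take every interval-th unit."""
    start = rng.randrange(interval)
    return population[start::interval]

def split_odd_even(sample):
    """Split sampling units into two subsamples by odd/even serial number (1-based)."""
    odd = sample[0::2]   # units numbered 1, 3, 5, ...
    even = sample[1::2]  # units numbered 2, 4, 6, ...
    return odd, even

rng = random.Random(1953)
# Hypothetical plot yields (arbitrary units) laid out along a survey line.
plots = [rng.gauss(20.0, 4.0) for _ in range(2000)]

sample = systematic_sample(plots, interval=10, rng=rng)
odd, even = split_odd_even(sample)

mean_odd = statistics.mean(odd)
mean_even = statistics.mean(even)
combined = statistics.mean(sample)

# The gap between the two half-sample means is a rough check on the combined estimate.
print(round(mean_odd, 2), round(mean_even, 2), round(combined, 2))
```

A large gap between the two half-sample means would suggest that the combined estimate is less reliable than its nominal precision implies.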


Testing the Homogeneity of Treatment Means in an Analysis of Variance of Engineering Data. D. B.
Duncan, Virginia Polytechnic Institute.

This is a discussion of several methods recently proposed for testing the significance of differences 
between treatment means in an analysis of variance. The methods include (A) a studentized range
testing procedure, Newman (1939) and Keuls (1952); (B) a multiple F testing procedure termed the
Multiple Comparisons Test, Duncan (1947, 1951); and two test procedures (C) and (D) given by two
confidence interval methods, Tukey (1952) and Scheffé (1952) respectively.

The basic differences between these procedures are classified and illustrated graphically in a simple 
5% level case involving only three means. The most important difference is the use by A and B of a 
successive-tests principle not used by C and D. This makes A and B considerably more powerful than 
C and D, without any inappropriate increases in error rates. The second difference concerns the relative 
significance level of individual tests in the successive test procedures A and B. B is more powerful than 
A through using a special system of levels, the validity of which is briefly discussed. Other less important
points of difference are also illustrated. A separate section is included entitled “The Multiple
Comparisons Test Extended to the Problem of Separating Treatment Means With Unequal Replica- 
tions.” 


Some Applications of Statistics to Research in Time and Motion Study. H. C. Sweeny, Virginia Polytechnic
Institute.

Time and Motion Study is a procedure for determining the time required by an “average” operator, 
working at a normal tempo, to accomplish some task. At the present time, this procedure is more of an 
art than a science and, as such, contains many aspects which on close examination seem questionable.
The use of statistical techniques in the past in both research and application of Time and Motion Study 
has been limited, and in the few cases wherein advanced techniques have been used, the appropriate- 
ness of the model appears questionable. This paper reviews the special problems inherent in research and 
application of Time and Motion Study procedures, and discusses the use of statistical techniques to 
these problems. An example is given using data from an experiment to emphasize the problems inherent 
in research of this nature. 


The Elements of an Industrial Classification Policy. Walt R. Simmons, U. S. Bureau of Labor Statistics.
This paper is published in full elsewhere in this issue. 


The Use of Statistical Techniques in the Accounting Department of a Large Manufacturing Company. 

D. A. Тлутхавтом, Monsanto Chemical Company. 

There has been a growing demand for statistical techniques in the area of accounting communication.
Charts, tables, and statistical reports not only streamline presentation but they minimize the time
required by management to locate significant relationships and trouble spots. 

The work which the Statistician performs in the accounting department of Monsanto Chemical 
Company is: 1. Preparation of financial and operating chart series. 2. Periodical statistical reports and 
special studies. 3. Construction and presentation of indexes of selling and raw material prices.


Financial and Operating Charts 
Our chart program is composed of three basic series which are directed to the top levels of manage- 
ment: The first of these series, the Director's charts, consists of six charts which are graphic income 
statement summaries prepared quarterly to cover the operations of the Company. The Executive Committee
chart book series consists of about 110 annual and monthly charts and the accompanying
tabular data. These charts highlight the operations of the Company and compare Monsanto's results
with other leading companies in the chemical industry. For each of the seven Divisional Managers,
there is a divisional chart book, which has between 30 and 37 monthly and annual charts and tables which
review the operations of a division. 

The first or pivotal chart in the series shows the return on average investment ratio since it is an 
accepted Company policy to consider return on investment employed in the business as a prime measure 
of management effectiveness. Charts depicting the ratios of the various factors from net sales to net
income considered with other ratios related to average investment shed light on the causes underlying
variations in return.


Regular Periodical Reports and Special Studies 

Among the regular periodical statistical reports are a percentage of actual to rated capacity opera- 
tions report prepared monthly, a comparison of inventories by division and by inventory classification 
which is also issued monthly, and quarterly and annual comparisons of Monsanto's operations with 


time is one which traces in detail the income and financial growth since 1936 of the top seven chemical 
companies.

The responsibility for the initiation of any report and for its form and content resides in the Comptroller.
One of the main duties of the Statistician is to propose new areas needing investigation.


Indexes of Raw Material and Product Prices 


Monsanto has had selling prices, raw material, and wage indexes which extend from and were 
based in the year 1939. These indexes have been published regularly in the Company annual reports.
It was decided this year to revise the present indexes and convert them to a 1947-1949 average price
base.
The financial and operating charts, reports, and the price indexes have been tailored to one major 
objective: the needs and understanding of management. They are steps in a program designed to utilize 
statistical techniques for better communication of information to management. 


Discussion of Congressional House Committee Report of the Investigation of the Federal Crop Reporting
Service. John J. Heimburger, Counsel, Committee on Agriculture, House of Representatives,
“Summary of House Committee Report.” J. Roger Wallace, New York Journal of Commerce,
“Forecasting and Estimating the Cotton Crop.” John D. Baxrm, Longstreet-Abbott Company,
St. Louis, “Evaluation of Forecast of the Wheat Crop.” Lauren Soth, Des Moines Register and
Tribune, “Needed Improvements in Estimating the Corn Crop.”


(A consolidated abstract.) 


The methods of mail sampling to a non-probability list sample of reporters and graphic regression
estimating procedures are the same in principle for cotton as for all other major crops, except that be-


trend is indicated: Deviations for the August reports are down nearly 40 per cent; for September less
than 30 per cent; and for October and December about 50 per cent. The smaller decrease in the deviations
of the September, as compared with the August, reports is surprising, as the cotton crop is usually
“pretty well made” by September 1.

More than one-third of the deviations of the 1951 national estimates of cotton production from final 
ginnings was caused by seriously overestimating the cotton acreage in cultivation on July 1. 

Winter Wheat: The official reports on winter wheat production for comparable periods of plant 
development are less reliable than for cotton—both as to magnitude of the deviations between monthly 
reports and final estimates and as to constant bias (underestimation). 

There has been some slight, though not significant, improvement since 1922 in the accuracy of the 
winter wheat crop reports when they are evaluated against the final revised estimates of production 
(1922-30 versus 1941-50). 

Spring Wheat: The percentage deviations between the reports of spring wheat production and the
final revised estimates are significantly larger than is the case with winter wheat. An analysis of the
record (1922-30 versus 1941-50) shows no decrease in magnitude of deviations of monthly crop reports
except for the August report or in the tendency to underestimate production. This downward bias, how- 
ever, is not as great as with winter wheat. 

Corn: No satisfactory evaluation can be made of the reliability of crop reports on corn production, 
as no independent check data on production are available for comparison. In view of the wide variation 
from year to year in the moisture content of corn and, consequently, in feeding value, there has long 
been a demand for estimating corn production in terms of a constant moisture percentage. There is also a
demand for bi-monthly crop reports for corn. 

It is recognized that there are obstacles to forecasting accurately the out-turn of a crop. Weather, 
plant diseases, changing crop varieties, reports obtained from farmers and others, which often are 
biased by attitudes, all complicate the problem. The fact that crop forecasts have long been made by 
the government and by private forecasters in face of these difficulties indicates the need for them. 

Great reliance is placed upon the government crop reports by the trade, by processors, by farmers 
and by government action agencies. The greater the accuracy of official crop reports the smaller the 
element of risk that must be borne by the buyers and processors of agricultural products, and the nar- 
rower the price margin between the farmers and consumers. 

The methods used by the Crop and Livestock Reporting Service have not kept pace with developments
in scientific sampling of the last 15 years or in crop-weather relationship research extending over a
longer period. 

Crop reports, issued on the 8th to 10th of the month, are based upon crop conditions reported 
largely by farmers over a several day period, ending on about the 2nd of the month. Traditional operat- 
ing policies (made mandatory by law in the case of cotton) prevent use either of readily available weather 
information for the first eight to ten days of the month or the official five-day weather forecasts.

From the standpoint of the effect of production upon prices of the major crops, the crop reports of 
national production are of paramount importance, with regional estimates next in effect. Traditionally,
however, the primary objective of the government service has been to provide accurate state estimates.
From the standpoint of the reliability of a sample, the variability of the phenomena being sampled, both
in space and from year-to-year, is nearly as great within any one state as it is for a geographic region or
the entire country. Consequently, nearly as large a sample is required for any one state as for the entire
country for a specified level of sampling precision.
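
The sample-size argument can be illustrated with the textbook formula for estimating a mean from a simple random sample, in which the required size depends on the variability and the desired margin but not on the size of the area covered. The sketch below rests on that simplification (no finite population correction, no stratification), and the standard deviations are hypothetical figures chosen only to show the effect.

```python
import math

def required_sample_size(sigma, margin, z=1.96):
    """Simple-random-sample size needed to estimate a mean to within +/- margin.

    Ignores the finite population correction, so the size of the state or
    country being sampled does not enter the calculation.
    """
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical standard deviations of yield per acre (bushels).
n_state = required_sample_size(sigma=9.0, margin=1.0)    # one state
n_nation = required_sample_size(sigma=10.0, margin=1.0)  # entire country

print(n_state, n_nation)
```

With variability within the state only slightly below the national figure, the two required sample sizes are of the same order, which is the point made in the abstract.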

A full appreciation of these basic principles should lead to the adoption of the more realistic and
useful primary objective of national estimates of maximum reliability for the major agricultural products.
Unless this is done, any moderate increase in appropriations of a few hundred thousand dollars is
unlikely to result in any significant increase in accuracy of crop and livestock reports. 

Assuming that these “institutional” factors could be corrected and that the primary objective of
the Crop Reporting Service could be made more realistic, there are two general lines of approach, from
a methodological standpoint, that give definite promise of increasing the accuracy of official crop and
livestock reports. These two approaches could be adequately implemented with an increase in appropriations
of not more than 20 to 25 per cent.

Since the beginning of the Crop Reporting Service, mail sampling has been practically the only
method of sampling used in crop and livestock reporting.

Area probability sampling and objective sampling of plant characteristics could be used to tre- 
mendous advantage in strengthening the mail sampling program by placing a sound foundation under 


should be implemented. A June survey, however, would provide a powerful means for improving the
accuracy at the time of the year when it is most needed, namely, for acreage estimates which are used
in connection with all the crop reports from July until the final December crop report. 
Probability pre-harvest field sampling of crops for the purpose of determining yield per acre and
quality or moisture content is essential when reports from farmers are subject to considerable understatement
bias that is not constant from year to year.

Certain private crop forecasters have found that cotton ginners are more reliable crop reporters for 
cotton than farmers; operators of local mills and grain elevators are better than farmers for reporting 
on wheat.

If all information concerning crop conditions and weather available at the time crop forecasts are 
made, including the official five-day weather forecasts, were utilized statistically, the accuracy of fore- 
casts of crop production would undoubtedly be increased, especially during periods of critical weather 
conditions and plant growth and development. The date to which the forecast relates would be ad- 
vanced by 10 to 12 days over the present system. 

The considerable amount of research as to the relationship of weather to crop yields, conducted 
during the late 1930s, demonstrated that weather factors, as well as soil moisture, could be used
statistically, along with the reported condition of a crop, to increase the accuracy of forecasts of crop 
yields per acre during the growing season. None of the results of this research are being used by the 
Service at this time. 


The Mathematical Biophysics of the Cardiovascular System. George Karreman, University of Chicago.
The propagation and reflection of pressure waves in a fluid enclosed within an elastic tube are 


determinations on other clinical conditions, as, e.g., arteriosclerosis, it is shown that valuable information
about the degree of deviation in the thickness of wall or elasticity modulus might be obtained.


A Mathematical Theory of Capillary Exchange as a Function of Tissue Structure. George W. Schmidt,
University of Chicago.

An equation is developed giving the concentration of the venous blood in terms of the concentration 
of the arterial blood, the blood velocity, the capillary permeability, wall area, and density, the diffusion 
coefficient of the interstitial matrix and other tissue parameters. A discussion of the relative influence of
the various parameters upon the exchange rate is presented.

Equations are deduced giving the mean extra-cellular and the mean capillary concentrations in 
terms of the concentration of the arterial blood and the various tissue parameters. The assumption
used by some experimenters, that the mean capillary concentration is approximately equal to the ar- 
terial concentration is shown to be generally invalid. 

Consideration is given to the kinds of experiments which would be useful in testing the validity of
some of the assumptions and approximations of the theory. 


BOOK REVIEWS 


Facts from Figures. M. J. Moroney. Baltimore, Maryland: Penguin Books 
Inc., 1951. Pp. 472. $1.25. 


M. A. Girshick, Stanford University


I HAVE used this book as a text in an introductory course in statistics. I
have also had time to think about its value as a popular treatise on statistical
inference. The result is both enthusiasm and disappointment.

Moroney has written a truly remarkable popular exposition on what 
might be called classical statistics, and, in the process, almost caught the 
modern spirit as well. The book touches on practically all standard statistical 
techniques, ranging from graph construction to analysis of variance and 
covariance. No effort is spared by Moroney to make the techniques available
to the reader. With each new technique he gives step by step computational 
procedures so that the mathematically untrained can more easily follow the 
meaning of the formulas and symbols. Because he has not entirely succeeded 
in conveying to the reader the modern concepts of statistical decision making, 
however, his book does not fulfill the need for a good elementary text in 
statistics. Neither does it manage adequately to impart to the intelligent
layman and to scientists in other fields the fundamentals of the modern 
theory of statistics. 

There are, in my opinion, at least two prerequisites for writing a popular 
treatise on any scientific subject. One is a good style and, if possible, a sense 
of humor of the kind that Moroney possesses. The other is a deep understand- 
ing of the subject. No other form of writing tends to expose conceptual weak- 
nesses as glaringly as non-technical expository writing. This is particularly 
true in a new field such as statistics where the concepts are still fluid. In 
fact, one could venture a guess that no statistician ever learns how much of 
his statistical knowledge is nebulous until he attempts to write a popular 
version of what statistics is or give an elementary course covering the 
fundamentals of statistics. 

Style is not Moroney’s difficulty. One is almost envious of the ease with 
which some complicated statistical concepts get unfolded and explained. His 
humor is delightful. He employs ridicule effectively, but not offensively, 
since it is seldom directed against statistics, but rather against its misusers. 

A clear understanding of modern statistical concepts does appear to be a 
problem to Moroney. In this respect, he seems to possess a split personality— 
an ailment common to many statisticians. One part of him is the industrial 
statistician. In this capacity he portrays with lucidity and insight the 
main features of statistics as a guide to action and a tool in decision making. 
The other part of him is the classical statistician. In this role his portrayal 
of statistical ideas becomes somewhat muddy. Here, by and large, decision 
making gives way to a ritual known as performing tests of significance. As 
an industrial statistician he considers decision rules which lead to the
acceptance or rejection of lots. As a classical statistician his decision rules 
lead to the acceptance or rejection of Null Hypotheses. Since, clearly, an 
hypothesis is conceptually on a higher plane of being than a mere lot, there 
is less of the know-how and down-to-earth flavor in his discussion of tests of 
hypotheses than is found in his discussion of acceptance inspection. Again, 
as an industrial statistician he insists that a rule, such as an acceptance
inspection plan, must be evaluated by its Operating Characteristics. Not so 
in testing hypotheses. Here he no longer tells us to inquire what the conse- 
quences of a particular decision rule will be as a function of the possible 
states of nature. Instead, he recommends that we ascertain, by consulting 
an appropriate table, whether the result of the statistical test is “... ‘Probably
significant,’ ‘Significant,’ or ‘Highly significant,’ depending on the
probability level associated with the judgment” (page 218). This jargon is 
odious to him also—but not, unfortunately, the underlying idea, since 
admittedly he has no substitute for it. His way out is to claim that “... there
can never be any question, in practice, of making a decision purely on the 
basis of a statistical significance test. Practical considerations must always 
be paramount” (page 218). But shouldn’t the practical considerations be
incorporated in the statistical test to begin with? 
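
The idea of judging a rule by its operating characteristic can be made concrete with a short sketch. The single-sampling plan below (sample size 50, acceptance number 2) is hypothetical and is not taken from Moroney's book or from this review; it only shows how the probability of acceptance is computed as a function of the true fraction defective, under a binomial model.

```python
from math import comb

def prob_accept(n, c, p):
    """Probability of accepting a lot: at most c defectives in a sample of n
    when the true fraction defective is p (binomial model)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

# Hypothetical plan: inspect 50 items, accept the lot if 2 or fewer are defective.
n, c = 50, 2
oc_curve = {p: prob_accept(n, c, p) for p in (0.01, 0.02, 0.05, 0.10)}

for p, pa in oc_curve.items():
    print(f"fraction defective {p:.2f}: P(accept) = {pa:.3f}")
```

Reading "accept the lot" as "accept the hypothesis", the same kind of table describes the consequences of a significance test over the possible states of nature.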

At this point it is probably clear that the criticism I am making is in 
reality directed at every elementary statistics book on the market. And most
such books do not treat classical statistics half as competently as Moroney
does in Facts from Figures. It is unfortunately true that the material going 


evaluated by their consequences. These consequences are expressible in 
terms of risks, or more intrinsically, in terms of the probabilities of taking 
the various permissible actions which are induced by the experiment, decision
rule, and the possible states of the system. In brief (and with due
apologies to Moroney), not facts from figures but rather decisions from 
observations should become the main emphasis in elementary statistical 
Observations, 

The insistence that we discuss and display, as ingredients in a decision
making situation, the unknown states of nature, the possible experiments, 
the available decision rules, and consequences of these rules as a function 
of the unknown states is, in my opinion, a primary prerequisite for any 
intelligent approach to statistics. I am convinced that had Moroney been
aware of this and understood it he would have written the elementary book 
that is needed. In addition, he would have avoided many fundamental 
conceptual mistakes as, for example, confusing parameters of distributions
with statistics (Chapters 4 and 5), performing a two-sided test of a hypothesis
when a one-sided test is called for (pages 222 and 228) or explaining the existence
of two regression lines by the fact that “... when we estimate y
from a given value of x, it is the sum of the squares of the discrepancies in y
which has been minimized. When we estimate x from y, it is the sum of the
x discrepancies which have been minimized.” (Presumably, we need only to
change our method of estimation in order to abolish the existence of the 
two regression lines.) One could take issue with Moroney on many other 
statements found in the book, but I believe they all flow from the same 
fundamental conceptual weakness. 
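
The two regression lines can be exhibited on simulated data. The sketch below is an illustration under assumed figures (500 hypothetical pairs with a true slope of 0.6); it fits both lines by least squares and checks the standard identity that the product of the two slopes equals the squared correlation, so the lines coincide only when the correlation is perfect.

```python
import random
import statistics
from math import sqrt

def sums(u, v):
    """Return the centered sums of squares and cross-products of u and v."""
    mu, mv = statistics.mean(u), statistics.mean(v)
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return suu, svv, suv

rng = random.Random(0)
# Hypothetical data: y depends linearly on x plus independent noise.
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [0.6 * a + rng.gauss(0.0, 0.8) for a in x]

sxx, syy, sxy = sums(x, y)
b_yx = sxy / sxx            # slope of the regression of y on x
b_xy = sxy / syy            # slope of the regression of x on y
r = sxy / sqrt(sxx * syy)   # sample correlation coefficient

print(round(b_yx, 3), round(b_xy, 3), round(r * r, 3))
```

Plotted on the same axes, the line for x on y has slope 1/b_xy, which differs from b_yx whenever the squared correlation is below one.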

Nonetheless, this little book is a joy to read and can be highly recom- 
mended to mature statisticians for the sheer fun of reading it; and to students 
for background material. 


An Introduction to Statistics. Charles E. Clark (Associate Professor of Mathe- 
matics, Emory University, Georgia). New York: John Wiley & Sons, Inc., 1953. 
Pp. x, 266. $4.25. 


7. S. Маямоззкт, University of Connecticut 


OUT of 218 pages of text approximately 18 pages are devoted to concepts
of descriptive statistics before a final chapter of 30 pages on simple
correlation. The only subject matter taken up in the field of descriptive
statistics is the frequency distribution, the arithmetic mean, the standard 
deviation, the histogram and correlation. Questions and answers help to de- 
velop the frequency distribution and histogram more completely but prob- 
ably not enough to prepare the student adequately to criticize the mid- 
values in the table used on page 140. These aspects of descriptive statistics 
(except for correlation) are introduced only because they are used in the 
text to develop statistical inference. In the final chapter on correlation, how- 
ever, inference is not emphasized at all. No mention is made of the reliability 
of the coefficient of linear correlation nor of the standard error of estimate.
This seems unfortunate in a text which is so obviously an introduction to 
statistical inference. 

This text is, then, comparable to S. S. Wilks’ Elementary Statistical
Analysis and Dixon and Massey’s Introduction to Statistical Analysis. 
For the instructor who feels justified in the complete abandonment of the 
median, mode, quartiles, percentiles, ogives, graphic analysis other than the 
histogram, significant digits and rounding, sources of data, index numbers 
and time series, the text is an adequate one, provided the teacher guides 
the student carefully past a few of the difficulties mentioned below. Except 
for these, the explanations are fairly consistently lucid and the development 
well organized. Especially in the first sections of the book (on permutations 
and combinations, the histogram, and linear interpolation), both the prob- 
lems and the answers play an excellent and essential role in the exposition. 
(Odd-numbered problems are answered in an appendix of 34 pages.) Although
the author maintains some of the format of a mathematics text (theorems 
and proofs), the presentation is essentially non-mathematical—within the 
grasp of the student who has not had college algebra. Computation tech- 
niques receive probably a basic minimum of space. 

This is a text which can reasonably be covered in а one-semester course in 
statistics. Professor Clark does not attempt to cover completely any more 
than sample means, sample proportions, and differences between sample 
means and sample proportions. All of this is presented in very great detail 
in the first 154 pages. The order of presentation of these concepts is as fol- 
lows: introduction to statistical inference (4 pages); permutations and combi- 
nations (8 pages); probability (30 pages); frequency and probability distri- 
butions (48 pages); the reliability of sample means and probabilities (48 
pages); the significance of the difference between two sample means or per- 
centages (15 pages). After this very detailed development, the author men- 
tions analysis of variance (14 pages) and chi-square (16 pages) primarily, 
it seems, to emphasize the limitations of the techniques already developed. 
Even with this limited objective, an adequate reason why the variance be- 
tween samples is comparable to the variance within samples could have 
been presented. Assumptions are not mentioned in these 14 pages on the 
analysis of variance. 

Here are a few more specific points which impressed this reviewer and in 
which the prospective user of the text might be interested.

In his chapter on probability, the author defines three types of probability: 
empirical probability, a priori probability, and statistical probability. The
definitions are rather carefully set forth and adhered to throughout the text. 
This reviewer prefers to emphasize one definition of probability (somewhat 
akin to what Professor Clark calls “statistical probability”) and to use the 
adjectives “empirical” and “a priori” simply to distinguish between different 
methods of approximating probability. The author’s definition of empirical
probability necessitates referring to sample proportions as sample prob- 
abilities. Probably his only failure to use the term “sample probability” 
consistently is in the heading of Chapter 6. The definition of statistical prob- 
ability given on page 19 involves the concept of confidence, which might 
preferably be attached to estimates of probability rather than to the probability
itself.

In the text proper (as distinguished from the questions and answers) 
Professor Clark does not introduce the term “null hypothesis” nor the word
“hypothesis” until the chapter on inferences from chi-square. Instead, he
limits himself to the development of two-limit and one-limit confidence
intervals. Unfortunately, the application of confidence intervals to the
problem of the testing of hypotheses in the case of the normal approximation 
to the binomial, where the sample proportion is directly substituted into the 
formula for the standard error of a sample proportion, can introduce a serious 
error. For example, the illustration in Section 8.2 is incorrect not only for the 
reason stated by the author but because the standard error of a sample
proportion is computed incorrectly. Instead of a t of 2.3 the use of 1/6 instead 
of 1/12 would have given a t of 1.7. This correction could make a difference 
in the conclusions. 
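
The size of this correction can be checked numerically. The sketch below assumes that 1/12 is the observed sample proportion and 1/6 the hypothesized proportion, and it assumes a sample size of 60; neither the roles of the two fractions nor the sample size is stated in this review, so all three are labeled assumptions. The ratio of the two results does not depend on the sample size chosen.

```python
from math import sqrt

def deviation_in_se_units(p_hat, p0, n, use_hypothesized):
    """Deviation of the sample proportion from p0 in standard-error units.

    use_hypothesized selects which proportion is substituted into the
    standard error formula sqrt(p * (1 - p) / n).
    """
    p_se = p0 if use_hypothesized else p_hat
    return abs(p_hat - p0) / sqrt(p_se * (1 - p_se) / n)

n = 60           # assumed sample size, not given in the review
p_hat = 1 / 12   # assumed to be the sample proportion
p0 = 1 / 6       # assumed to be the hypothesized proportion

t_sample = deviation_in_se_units(p_hat, p0, n, use_hypothesized=False)
t_hyp = deviation_in_se_units(p_hat, p0, n, use_hypothesized=True)

print(round(t_sample, 1), round(t_hyp, 1))  # 2.3 1.7
```

Under these assumptions the two values reproduce the figures quoted above: substituting the sample proportion overstates the deviation by a factor of about 1.35.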

On page 236, in his answers to problems 11, 13, 15 and 17, the author 
handles this same type of problem more adequately. In fact, problem 17 is 
essentially the same problem (as in Section 8.2) but analyzed properly. The 
answers to these four problems are basically an introduction to the testing 
of hypotheses without any of the usual terminology. A more complete
presentation (including some exposition of the error of the second kind) in 
the text proper would be preferable. 

In the reading and problems on differences between sample means and 
sample proportions Professor Clark equates the following: “... we can say
with 99% confidence that the first universe has a greater mean than the 
second universe” (page 145) and “we found that with 99% confidence we 
can say that the two universes involved have different means” (page 149). 
The author is essentially working with only the former concept because of 
his restriction to confidence intervals. In this same chapter, failure to work 
with one-limit confidence intervals yields lower confidence levels than are 
actually applicable. Thus while theorem 6.5 is correct in saying that “we 
can say with c% confidence that the mean of the first universe is greater 
than the mean of the second universe,” the last paragraph in Section 6.6
on page 150 is incorrect in its interpretation that “with confidence greater 
than 99.7% we make no inference by theorem 6.5.” Actually the confidence 
can be as high as c% plus one-half of (100 minus c)% and should be so stated
in theorem 6.5. This would make it analogous to theorem 5.17.2 which also
is a one-limit confidence interval. Theorems 6.8, 6.12.1, and 6.12.2 should be
comparably adjusted. 
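The adjustment described here is simple arithmetic, sketched below. The sketch assumes a symmetric two-limit interval, so that half of the excluded probability lies on the side that still supports the one-limit statement. The function name is illustrative.

```python
def one_limit_confidence(c):
    """Confidence (in percent) of a one-limit statement derived from a
    symmetric two-limit interval of confidence c percent: the excluded
    tail on the favorable side is added back."""
    return c + (100.0 - c) / 2.0

print(round(one_limit_confidence(99.7), 2))  # 99.85
print(one_limit_confidence(90))              # 95.0
```

Under that assumption, a 99.7% two-limit statement supports a one-limit statement at about 99.85%.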

Professor Clark does not give enough space to distinguishing between 
his confidence statements about samples, which dominate the exposition, 
and his confidence statements about populations. From the presenta- 
tion given, it seems unlikely that the student will perceive that the confidence 
statements about population parameters must have different numerical 
limits for each statement (in accordance with the sample results obtained) 
for the given confidence to work out. 

In the nature of more minor criticisms are the following. In the para- 
graph in the center of page 96, the word “bias” is used in reference to random 
sampling error. At the end of this same paragraph there is an implication 
that stratified samples are not random samples. Problem 5 on page 89 should 
not be introduced without a continuity correction. The correct answer to
two decimal places is .17 which is also obtained with the proper continuity 
correction. The book’s answer (obtained without the continuity correction) 
is .10. 
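The size of the effect can be seen in a minimal sketch. The numbers below are hypothetical and are not those of Problem 5, which the review does not reproduce. The sketch compares an exact binomial upper tail with its normal approximation, with and without the continuity correction.

```python
import math

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def binomial_upper_tail(n, p, k):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Hypothetical numbers, not those of the book's Problem 5.
n, p, k = 20, 0.5, 13
mean, sd = n * p, math.sqrt(n * p * (1 - p))

exact = binomial_upper_tail(n, p, k)
uncorrected = normal_upper_tail((k - mean) / sd)
# Continuity correction: approximate P(X >= k) by the area above k - 1/2.
corrected = normal_upper_tail((k - 0.5 - mean) / sd)

print(f"exact={exact:.3f} uncorrected={uncorrected:.3f} corrected={corrected:.3f}")
```

In this illustration the corrected approximation lies much closer to the exact tail than the uncorrected one, the same pattern as the .17 against .10 noted above.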

Very few of the trivial errors that tend to appear in the first printing of a 
new text were noticed. On page 162, m„=4531 should be m,=4731. On page
164, 2(12,500 − 47,777)² should be 2(12,500 − 4,777)². On page 233, the answer
to the fifth part of problem 7 under Section 3.3 should be 50 instead of .50. 
On page 224, the answer to the third part of problem 1 is incorrect. One
part of the answer is also left out for this same problem. (In reference to 
all these problems it would help considerably for class- and home-work if the 
parts of problems were identified by letter.) A final minor error noticed: the 
description set forth under the histogram on page 228 does not agree with 
the histogram itself. 


Advanced Statistical Methods in Biometric Research. C. R. Rao (Professor of 
Statistics, Indian Statistical Institute, Calcutta). New York: John Wiley & 
Sons, Inc.; London, Chapman & Hall, Limited, 1952. Pp. xvii, 390. $7.50. 


Rosedith Sitgreaves, Stanford University


The author's object is “to present a number of statistical techniques,
keeping in view the requirements of both the student who questions the 
basis of a particular method employed and the practical worker who seeks 
a recipe for the reduction of his data.” In keeping with this purpose, the first
two chapters are devoted to mathematical theory, the first to the algebra 
of vectors and matrices and the second to probability distributions. The 
next five chapters deal with methods of estimation and tests of hypotheses. 
The last two chapters are concerned with statistical methods in problems of 
classification. By and large, the author assumes that the reader is familiar 
with the fundamentals of probability theory and univariate statistical 
inference.

The book presents a wide variety of useful statistical techniques. Chapter 
3 treats linear estimation, tests of linear hypotheses, combination of weighted 
observations, tests of hypotheses with a single degree of freedom, analysis of 
variance, theory of statistical regression, and the problem of least squares 
with two sets of parameters. Chapter 4 is devoted to the general problem of
estimation with discussions of minimal variance estimates, maximum likeli-
hood estimates, and sufficient statistics. Chapter 5 deals with large sample
tests of statistical hypotheses, particularly tests based on statistics with a
limiting normal distribution, or a limiting chi-square distribution under the
null hypothesis. Tests of homogeneity of variances and correlation coeffi- 
cients are given in Chapter 6. Chapter 7 discusses tests of significance in 
multivariate analysis. Two types of tests are presented, namely, tests based 
on discriminant functions where the multivariate problem is reduced to a 
univariate problem by considering linear compounds of the original variables, 
and tests based on Wilks’ lambda criterion for problems representing multi-
variate extensions of univariate analysis of variance. 

Chapters 8 and 9 include, in addition to the classical problem of classifi- 
cation, discussions of the resolution of a mixed series into two Gaussian 
components, the allocation of a number of individuals to two or more 
groups, the problems of optimum selection, and the problem of classifying 
different groups of individuals to form a significant pattern.

Throughout the book, numerical examples, drawn largely from anthro-
pology, genetics, and general biology, are worked out in detail to illustrate
the computational procedures involved. In addition, a number of exercises 
and problems are provided for the more mathematically minded reader. A 
list of references is given at the end of each chapter. Although the book is 
addressed to biometric workers, a number of important biometric problems 
are not mentioned. These include problems of probit analysis and the 
general design of experiments, among others. 

The tests given in Chapters 3-7 are generally presented from the classical 
viewpoint of testing a null hypothesis against unspecified alternatives. The
notion of the power of a test and other concepts of the Neyman-Pearson
theory are not introduced until Chapter 8, preliminary to the discussion of 
multiple classification problems. In the treatment of the latter problems, 
use is made of the concept of losses associated with various wrong decisions, 
and optimum decision procedures are obtained for a priori probabilities of 
the different alternatives. Rao differentiates sharply between tests of null 
hypotheses and problems of multiple classification and appears to doubt 
the notion that all problems of statistical inference can be given a general 
formulation. He feels that although various attempts have been made to 
build up a general theory, it is difficult to argue whether or not such a theory 
exists. These views are surprising in the light of the recent researches of 
Wald and others.

The book should prove a valuable source of statistical knowledge for work- 
ers in both theoretical and applied statistics. It is likely that more insight 
into the problems considered in this book could be gained if they were
treated more generally in the unified framework of statistical decision theory.
However, this task is perhaps best left for the future, since the development
of the theory of decision functions has thus far outstripped its application.


Statistical Methods for Chemical Experimentation. W. L. Gore. New York: 
Interscience Publishers, Inc., 1952. Pp. vii, 210. $3.50.


See the article by C. Daniel, pp. 476-85, in this issue. 


Econometrics. Gerhard Tintner. New York: John Wiley & Sons, 1952. Pp. xiii,
370.


Daniel B. Suits, National Bureau of Economic Research


The testing of economic hypotheses and the measurement of theoretically
meaningful economic relations is a science in which direct experiment is 
virtually impossible. Empirical economic research must rely on data derived 
from such facts as the actual operations of the economy happen to generate, 
and can test and measure only in terms of such situations as have arisen. 
The result is rather special limitations on the research techniques which 
can be employed, and method appropriate to any given problem, like gold, 
is where you find it.

The quantity and quality of tools available to the would-be prospector in
this field have greatly increased in the last two decades or so. Thus, although 
Tintner has provided a nontechnical introduction to econometrics as Part I 
of his book, it is his primary purpose to collect and present a wide selection 
of these newer methods for the economic researcher. 

The technical portion of the book begins with Part II, “An Introduction 
to Multivariate Analysis” which includes such items as discriminant analy- 
sis, canonical correlations, the treatment of errors in the variables and
certain problems of identification. Part III, “Some Topics in Time Series 
Analysis,” includes discussions of trends and seasonal adjustments, auto- 
correlation and stochastic processes, and transformations of time series 
data. Except for some of the illustrative examples there is no claim to 
originality. The topics are well chosen and the discussion is accompanied
throughout by ample reference to sources and collateral material. 

Tintner has done good service in collecting these scattered techniques, but 
the over-all organization of his presentation is not well thought out from 
the point of view of the usefulness of the volume as an aid to the economic 
researcher.

The organization is based upon statistical topics rather than research 
problems. The result is that special cases of generally related research prob- 
lems are given widely separated treatment, often as coordinate topics, 
with inadequate attention to the relationships among them. Indeed some 
of the cross references are more confusing than helpful and point away from, 
rather than toward, the nature of the relationships. For example, the 
problem of identification as discussed in Chapter 7 (pp. 154-84) gives the 
reader an impression of complete generality, although in fact the conditions 
given there, that a particular member of a system of stochastic equations 
be identified, do not apply to recursive systems, where the problem of identi- 
fication does not arise. However, when the latter are discussed under the 
heading “Stochastic Difference Equations and Process Analysis” (Section
10.3.7, pp. 275-277), the reader is referred back to Chapter 7. Moreover the
reader who attempts to apply the conditions of Chapter 7 to the four equa- 
tion recursive system of Section 10.3.7 will find that, while two of the equa- 
tions of the system appear to be under-identified by this test, no mention or 
justification is given the seeming contradiction. Again, a diagonal recursive 
system, under the heading “Systems of Stochastic Difference Equations” 
(Section 10.3.4, pp. 267-69) is treated as if it were essentially different from, 
rather than a special case of, a triangular system. This impression is rein-
forced by the forward reference concluding the section: “ . . . [The illustra-
tive example given] . . . can evidently be considered only as purely descrip-
tive, and the individual equations cannot be identified with meaningful
economic relationships. ... A method which is based on the idea of process
analysis, where the individual equations have definite economic meaning, 
will be presented in section 10.3.7” (p. 269). The clearly-suggested difference 
in method is, of course, mistaken. The real difference in Tintner's treatment
of the two topics lies in his selection of illustration.

The organization of the book has another unfortunate by-product in that 
it gives rise to no occasion on which to discuss certain cases which do not fit 
the categories used. The problem of identification as treated in Chapter 7 
precedes the time series discussion, hence no lagged variables are included 
in the treatment. In the section on time series, however, the treatment is 
limited to complete recursive systems—i.e., those in which there are no 
exogenous variables. Moreover, in neither case is there a discussion of the 
role of exact relations among the variables—e.g., definitions—although 
such a discussion is certainly in order. Without the advantage of a more 
general discussion, the reader who is faced with a recursive system which 
also contains exogenous variables, or a nonrecursive system with lagged 
variables, or with a system containing a definitional relation is left in the
dark.

A few well placed pages devoted to the nature of stochastic systems in 
general and the various special cases frequently encountered would have 
contributed both greater coherence and wider applicability to the work. 

Quite apart from the question of organization, the presentation frequently 
leaves the reader with little understanding of the kind of research problems 
to which a given topic might be applicable. He must rely largely on the 
illustrative examples for guidance. Some of these leave nothing to be
desired. Indeed, Tintner’s own application of the method of weighted re- 
gressions to test the hypothesis that the demand and supply of British 
labor are homogeneous functions of order zero represents econometrics at its 
best. The economic theory at issue is discussed, the problem is formulated, 
the test is made and critically evaluated. 

On the other hand, the examples are in some cases obscure and are fre- 
quently so removed from a research context as to be meaningless. This is 
particularly notable in Part III where the same time series are used over 
and over, subjected without regard to aptness to whatever manipulation 
the current topic may require. Thus the American meat consumption series 
is involved in stochastic systems to illustrate the just-identified equation 
(pp. 168-71), and the over-identified equation (pp. 177-84). It is fitted with 
a cubic trend (pp. 195-98), and is subjected to Fourier analysis (pp. 220-27). 
Its correlogram is analyzed to test for hidden periodicities (pp. 225-27), 
to test whether the series might be represented as a moving average of a 
stochastic variable (pp. 290-92), and to test whether a second order differ- 
ence equation is satisfied (pp. 298-99). It is again analyzed as a difference 
equation of second order (pp. 262-63), and of third order (p. 267), and as a
difference equation with errors in the variables (pp. 274-75). It also illustrates 
the variate difference method (pp. 320-23). In addition this series is used in 
regression equations with other variables on a number of occasions. 

This is all done without regard to the purpose of research or the meaning-
fulness of any particular application. If it does not confuse, it certainly
does not help the prospecting researcher, who may well wonder, for example, 
what to make of the conclusion that “it is not impossible that [the deviation 
from a cubic trend of] the American meat consumption series follows a
stochastic process of the type of moving averages” (p. 293). 

Even the more or less mechanical aspects of presentation leave a great
deal to be desired. The exposition is sometimes so abbreviated that it would 
be difficult for a reader not already familiar with the material to see the 
point. In Section 10.3 (pp. 269-72) for example, the reader is plunged into a
discussion of distributed lags as treated by Roos in his Factors Influencing 
Residential Building, without having been told explicitly what distributed 
lags are. Moreover the rationale of Roos’ fairly complicated economic model 
is left obscure to the point that not all the variables in the system are defined. 
Again, the description of the use of orthogonal polynomials in trend fitting 
(pp. 190-98) is carried out without a clear explanation of what they are. 
And, although more than ten pages are devoted to the exposition and appli- 
cation of a method for obtaining consistent estimates of the parameters of 
a single over-identified equation (pp. 172-84), no motivation is given for 
the manipulations, nor is the question of why over-identification is a problem 
ever raised. 

In spots the text is carelessly worded. Thus, the necessary and sufficient 
conditions that a given equation in a stochastic system be just identified
(p. 167) are first scrambled together in a non sequitur before being straight- 
ened out in the following paragraph. Finally, the number of misprints, some 
occurring in functions and equations, is astonishing. 

The unfortunate conclusion is that Tintner’s book will best serve those 
already reasonably familiar with econometrics and econometric method
who want a catalog to the literature. It is not a reliable and useful guide to
the prospecting economic researcher. 


The Theory of Linear Estimation. M. V. Jambunathan. Bangalore, India: India
Book Company, 1951. Pp. vi, 84. Rs. 3/-. 


William G. Madow, University of Illinois


As far as it goes, this is a neat little book, but it suffers from a major
defect. It omits material that is needed today by any person who wishes 
to use the subject matter of the book. 

Before discussing the lacks of the book let us briefly outline its contents. 
It is in two parts. Part I (pp. 1-44) deals with linear estimation. As here 
defined, this is the estimation of a linear function of parameters of which the 
expected values of independent random variables are linear functions. The 
best unbiased estimate is obtained and its variance is derived. Simple 
algebraic tools are used. Part II (pp. 47-83) is entitled “Testing of Hypothe-
sis.” After making the usual normality hypotheses, the usual tests of signifi-
cance are obtained. Again, it is concisely done.

The book does not purport to present new results but only to be a “succinct
account” of its subject matter. In omitting any discussion of the power 
function and decision theory, the usefulness of the book has been greatly 
reduced. Tang’s fundamental paper is not even mentioned, let alone the 
later work of Wald and others. These omissions are serious since the student 
who reads the book may be led to feel either that no further work has been
published, or that it is unimportant, or that it has no value in practice. Yet 
recent research, in the analysis of variance as elsewhere, has immediate and 
important practical applications. 

To summarize: What the book does, it does well. But it does not include 
results that are of immense importance. Since no real applications are pre-
sented, the omissions cannot be justified by any lack of necessity of the
omitted work in view of the particular applications made in the book. 


Introduction to the Theory of Games. J. C. C. McKinsey. New York: McGraw-
Hill Book Company, 1952. Pp. 371. 


Irwin Bross, Cornell University Medical College 


Some three hundred years ago a French gambler happened to ask a mathe-
matician about the odds in a dice game that was popular at that time.
Out of this innocent query was to grow the subject of mathematical prob-
ability and, in direct line of descent, the topic of mathematical statistics.
There are still vestiges of the game heritage in modern statistical practice.
Dice and card games are often used as examples in courses in statistics. 
Occasionally dice or numbers in a hat are used to randomize an experiment, 
and an important modern technique is called the “Monte Carlo Method.” 

Insofar as games of pure chance are concerned, such as honest dice or 
roulette, the mathematical analysis has been quite successful. The mathe- 
matician’s advice concerning odds has been tested by gamblers over a period 
of many years and has been found to be sound. The record of mathematical 
analysis in more complex games such as poker, chess, or bridge has not been 
too successful. For example, it is well known that a person who plays poker 
strictly according to the published tables of odds will lose to a good poker 
player. 

The failure of the earlier mathematical analyses of games such as poker 
was due to the omission of an important element—strategy. In poker or chess 
or bridge the player has personal choice and can derive benefit from an in- 
telligent line of play. 

The first comprehensive attack on the problems of games of strategy was 
made by J. von Neumann in 1928. As the title Theory of Games and Economic 
Behavior would indicate, von Neumann’s pioneer book (with Morgenstern) 
was intended to apply not merely to parlor games, but also to economic 
situations. 

J. von Neumann’s original work has been extended and elaborated into 
a new sub-field of mathematics called “game theory.” A very excellent ac-
count, which is concerned “almost entirely (with) the purely mathematical
aspects of the theory,” is provided by McKinsey’s Introduction to the Theory 
of Games. The first chapters of the book (Chapters 1 through 8) are about 
at the level of a B.A. in mathematics. Considerably more advanced mathe- 
matics are utilized in the latter chapters. Clearly written and well planned, 
the book provides a very readable discussion of game theory (including more 
recent advances).

Statisticians might be surprised to learn that Chapter 13 of McKinsey’s 
book is titled “Applications to Statistical Inference.” This raises the ques- 
tion: Is game theory of importance to statisticians; that is, will many statis- 
ticians benefit by learning about game theory? 

My answer to this question is in the negative, although I enjoyed person- 
ally reading this book and I would recommend it strongly to anyone who 
wished to learn about game theory. 

My objections to game theory do not concern the mathematics, but 
rather the basic ideas of this field. The essentially new step taken by game 
theory was to bring strategy into the mathematical picture. Thus if two 
players A and B were engaged in a game, the analysis would have to consider 
their respective strategies. Now evidently A’s strategy is going to depend on 
B's, and B's strategy will in turn depend on A's, so this gets into a merry-
go-round. What is worse, all sorts of psychological considerations enter the
picture, for A’s strategy depends on what A thinks B’s strategy will be, and 
so on. This psychological interaction is, of course, the heart of any game of 
strategy—it makes games fun. 

The approach chosen by von Neumann provides a very elegant mathe- 
matical formulation for game theory and gets rid of the messy psychological 
issues. However, the procedure comes perilously close to “throwing out the 
baby with the bath water.” What is done, basically, is to replace the two 
players by two computing machines or robots, and to give these robots 
special instructions. The gist of these instructions is: “Maximize your 
minimum expected gain.”

The robot game described above would seem to be an appropriate model 
for intellectual games with high caliber opponents. In a chess game, for
example, a player might very well maximize his minimum expected gain.
Thus player A would consider his available moves and for each move try to
envisage the best countermove by player B (i.e., A's minimum expected
gain). Player A would appropriately select as his own move the one which 
gave the most advantage even against the best defense. 
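The rule just described can be sketched for a small table of payoffs. The payoff values and the function name below are hypothetical, and the sketch covers only the choice among single moves (pure strategies), not the randomized strategies that game theory also allows.

```python
def maximin_move(payoffs):
    """Return (index, guaranteed_gain) of the row whose worst-case payoff
    is largest. payoffs[i][j] is player A's gain when A plays move i and
    B answers with countermove j."""
    worst_case = [min(row) for row in payoffs]
    best = max(range(len(payoffs)), key=lambda i: worst_case[i])
    return best, worst_case[best]

# Hypothetical payoff table: three moves for A, three replies for B.
payoffs = [
    [4, -2, 1],
    [2, 1, 3],
    [5, -4, 0],
]
move, gain = maximin_move(payoffs)
print(move, gain)  # 1 1
```

Here the second move is chosen: its worst outcome (a gain of 1) is better than the worst outcome of either alternative, even though the other moves offer larger gains against a poor reply.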

On the other hand, for many parlor games (and real life games) the robot 
model is of dubious utility. In actual games, player A should study player
B's style so as to take advantage of B's mistakes. In game theory a player
would use the same strategy against dub or expert.

The association between game theory and statistics arises from the fol-
lowing analogy. Suppose now that player A is a statistician. Suppose also
that player B is “nature.” The statistician makes the first “move” in this 
game by doing an experiment. Nature’s answering “move” is a set of data.
The statistician’s next “move” is to examine the data and make some decision
(e.g., reject a shipment of parts). If the statistician makes the wrong decision
he pays a penalty (which may depend on the extent of his “error”).

McKinsey recognizes that “nature cannot properly be conceived as trying 
to outwit us,” but suggests that “the player may be interested in determining 
what is the worst nature can do to him.” McKinsey then asserts, “Situations 
of this sort arise particularly in connection with statistics.” To buttress this 
statement McKinsey gives three examples, one of which is “to maximize the 
accuracy of the determination of a quantity for a given cost.” These exam-
ples serve only to refute McKinsey’s assertion, for they are all “pure maxi-
mization problems in the classical sense” where McKinsey himself admits
“there is no question of countering the moves of another rational creature.”

After this unpromising start, McKinsey proceeds to give a very simplified 
version of a problem in public opinion sampling. “A certain urn is known
to contain two balls, each of which is either black or white. A statistician, S,
wishes to make a guess as to how many balls are black.” Suppose that if S
guesses right he receives $100. If he misses by one he receives nothing. If
he misses by two he must pay out $100. S may inspect one ball (or both),
but each inspection costs him $50. 

It should be noted that this is not a problem in statistical inference. If one 
ball is inspected (i.e., the sample is taken), there is no attempt to use the
sample to make inferences about the population. Indeed, the example is
such that one ball provides no information about the other ball. What has 
happened is that the statistical problem has been “simplified” out of exist- 
ence. 

Be this as it may, it is instructive to consider the game theory solution 
to the problem. The statistician is advised to behave as follows: he tosses a
coin. If the coin shows heads, he announces that one ball is white and one
is black without taking a sample. If the coin shows tails he examines one ball 
and guesses that both balls are of the same color as the one tested. 
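What this rule secures can be worked out directly from the payoffs stated above. The sketch below assumes a fair coin and that the inspected ball is drawn at random from the two; the function names are illustrative.

```python
from fractions import Fraction

def payoff(guess, black):
    """Payoff to S for guessing `guess` black balls when there are `black`:
    $100 if right, nothing if off by one, -$100 if off by two."""
    return {0: 100, 1: 0, 2: -100}[abs(guess - black)]

def expected_gain(black):
    """Expected net gain of the coin-toss rule against an urn holding
    `black` black balls out of two."""
    half = Fraction(1, 2)
    # Heads: guess "one of each" without sampling (no inspection cost).
    heads = payoff(1, black)
    # Tails: inspect one ball at random ($50), then guess both match it.
    p_black = Fraction(black, 2)
    tails = p_black * payoff(2, black) + (1 - p_black) * payoff(0, black) - 50
    return half * heads + half * tails

for black in (0, 1, 2):
    print(black, expected_gain(black))  # 25 in every case
```

Under these assumptions the rule yields an expected gain of $25 whatever the composition of the urn, so nothing nature "does" can reduce it.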

The reason why the game theorist’s advice to the statistician is so queer 
is not hard to discern. The game theorist says in effect: “Don’t look at data
or past experience (i.e., don’t try to learn nature’s strategy); consider the 
worst that nature might do instead of what nature is likely to do.” 

This advice hardly makes sense unless the statistician believes that the 
world is, quite literally, against him.


A Theory of Psychological Scaling. Clyde H. Coombs. University of Michigan 
Engineering Research Institute Bulletin No. 34. Ann Arbor: University of Mich-
igan Press, 1952. Pp. vi, 94. $1.75. Paper.


Bert F. Green, Massachusetts Institute of Technology


The scaling problem considered in this monograph is that of accounting
for the observed interrelationships among a set of qualitative variables
by relating them to one or more hypothetical “underlying” variables. Pro-
fessor Coombs presents a general conceptual rationale for many of the
available psychological scaling methods. 

The monograph begins with a general discussion of the theory of measure- 
ment. The logical properties of various types of scales are discussed with 
special emphasis on scales based on partial orderings of the objects being 
scaled. Professor Coombs introduces his approach to the scaling problem 
by suggesting two systems of parameters. “The genotypic system refers 
to an inferred, hypothetical, latent, underlying basis of behavior. The pheno- 
typic [system] is the manifest, the observed level of behavior.” The scaling 
problem is “to study the information contained in a set of phenotypic obser- 
vations to determine what can be inferred about the genotypic level.” 
Two genotypic variables are defined. $Q_{hij}$ is the measure of a stimulus, $j$,
on some attribute for an individual, $i$, at the moment, $h$. $C_{hij}$ is the measure
of an individual, $i$, on some attribute of a stimulus, $j$, at the moment $h$.
The phenotypic variable, which is a psychological magnitude subject to
observation, is defined as $P_{hij} = Q_{hij} - C_{hij}$. A set of postulates is provided
that relates these variables to some of the typical rating procedures used
in scaling techniques. For example, for fixed $h$ and $i$, the judgment “Stimulus
$j$ is preferred to stimulus $k$” is represented as $|P_{hij}| < |P_{hik}|$, while the judgment
“Stimulus $j$ has more of (some attribute) than $k$” is represented by $P_{hij} > P_{hik}$.
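A minimal sketch of the two judgment rules may help fix the ideas. It holds the individual and the moment fixed, takes the phenotypic variable as P = Q − C, and reads the second rule as P for stimulus j exceeding P for stimulus k. The numerical values, the use of a single C for both stimuli, and the function names are assumptions made for illustration; they are not taken from the monograph.

```python
def phenotypic(q, c):
    """P = Q - C for one individual at one moment (h and i held fixed)."""
    return q - c

def prefers(p_j, p_k):
    """'Stimulus j is preferred to stimulus k': |P_j| < |P_k|."""
    return abs(p_j) < abs(p_k)

def has_more(p_j, p_k):
    """'Stimulus j has more of the attribute than k': P_j > P_k."""
    return p_j > p_k

# Hypothetical values: one individual measure C and two stimulus measures Q.
c = 5.0
p_j = phenotypic(6.0, c)  # 1.0
p_k = phenotypic(9.0, c)  # 4.0

print(prefers(p_j, p_k), has_more(p_j, p_k))  # True False
```

In this illustration stimulus j is preferred because it lies closer to the individual's own position, although stimulus k has more of the attribute.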

Next, the genotypic and phenotypic parameters are defined. These 
parameters are the conceptual components of variance of the Q's, C's, and 
P's. For example, the parameters based on Q, for each item j, are the variance 
of the Q's within individuals, i.e., replications, and the variance between 
individuals. The latter is further divided conceptually into the variance
accounted for by controlled factors, and the residual variance between indi- 
viduals. Analogous definitions are given for parameters based on the C's and 
the P's. The data obtained in a scaling experiment are to be classified 
according to which variance components are zero. After the general theory 
has been presented, it is related to two specific scaling methods, Coombs’ 
unfolding technique and the method of paired comparisons. 

The theory is not presented in an attempt to unify the field of scaling. Its 
purpose is to give a sound logical basis for certain types of scales—especially 
scales concerned with ordinal relationships. For example, Guttman’s scalo- 
gram technique may be encompassed by the theory, whereas Lazarsfeld’s 
latent structure analysis cannot be treated adequately. Professor Coombs 
voices a prejudice against stochastic models for scaling. He believes that the 
use of statistical concepts in scaling models “is to build an actuarial science 
at the possible cost of a science of individual behavior.” This reviewer feels 
that many of the factors influencing attitudes and preferences are at present 
uncontrollable; any attempt to make a detailed study of the individual case
in the face of a large error variance seems optimistic. However, the ultimate 
worth of a general theory should be judged by its utility in consolidating a
number of special techniques, and in suggesting new areas for investigation.
In this regard, it may be noted that the monograph is to some extent a status 
report, since it contains many references to theoretical and empirical studies
now in progress. 

Professor Coombs' monograph is written for a special professional audi-
ence. However, even specialists in the field will find that the monograph is
not easy reading. This is due in part to an excess of indigenous jargon. Sec- 
ondly, concrete examples are used sparsely. It would have been extremely 
helpful to have more specific instances of the conceptual theory. The ex- 
amples provided in the last two chapters are helpful, but not sufficient. 

Despite these shortcomings, Professor Coombs' work is an important 
contribution to the theory of psychological scaling. Workers in the field of 
scaling will find many interesting and stimulating ideas in this short
monograph.


Effective Management through Probability Controls: How to Calculate
Managerial Risks. Robert Kirk Mueller (Assistant General Manager, Monsanto
Chemical Company, Plastics Division). New York: Funk & Wagnalls Company
in association with Modern Industry Magazine, 1950. Pp. xvi, 310. $5.00.


Two Reviews follow: 


J. C. Blim, Associated Merchandising Corporation


The aims of this book are: First, to cite enough examples to prove the
case that statistical control utilizing the law of probability is really a
technique for modern management; second, to show that this technique can
be comprehended by personnel at the executive level who may not have a
background of mathematics, statistics or technical training; and third, to
show that the opportunities for applying this management tool are not
confined to manufacturing alone.

The author attempts to achieve these aims in five sections: I. How to make
the most of the significant: introductory in character; II. Why executives are
interested in statistical probability: an account of benefits to executives and
case histories; III. Brass hat facts about statistical control: operational
benefits and a once-over-the-field of statistics very lightly, especially
statistical quality control; IV. “But my business is different”: applications other
than to manufacturing and misapplications; V. Topside responsibility and
participation in a control program: organizing a program.

Among the good features of the book are its illustrations of charts and 
demonstration equipment, many of which are quite apposite, and its numer- 
ous references to mathematical history, recalling a feature often omitted 
entirely from an education in mathematics. 

An extremely long list of examples of statistical quality control in various 
companies provides suggestions for the idea hunter. Perhaps the most valu- 
able feature is a detailed documentation of the installation of a system in 
the Monsanto Company which, intentionally or unintentionally, brings home 
the fact that it is much more of an undertaking than many executives imagine.

The fundamental thesis of the book is one with which few will quarrel, viz.
statistical quality control is a good thing. This might serve as a summary of 
the book in one sentence. It is doubtful if it requires three hundred and ten 
pages to put it across to the reader. The book succeeds in the first and third 
of its declared aims but in respect of the second the term “comprehend” must 
be taken in a wholly superficial way. 

Against these virtues must be reckoned deficiencies which are numerous 
and obvious. 

The title is a patent misnomer. It is a matter of extreme doubt whether 
statistical quality control as here outlined is what the august body referred 
to as “management” will identify as a typical managerial risk. Managements 
notoriously identify themselves far more with the problems requiring experi- 
mental design, for instance. One will not be able, after reading the book, to 
calculate a managerial risk or any other kind of risk for the reason that one 
is never told how to do it. 

The statistical content is quite trifling, being purely descriptive. A descrip- 
tion of “factorial experiments,” for instance, is illustrative. It requires one 
sentence: “The solution of the problem is a technical one concerned with 
chi-square values, analysis of variance, degrees of freedom, interaction resid- 
uals, and many other of the mathematical aspects of such work.” 

Here the superficiality to be attached to “comprehend” in the second aim
of the writer is apparent. About the most the executive who reads the book
could hope for is to recognize an occasional term if he ever went to a
statistical meeting. Of course, “Addition, subtraction, long division and an
occasional square root thrown in is about all that is needed for an executive to
become reasonably familiar with statistical techniques.”

A conventional view of the role of the executive as one “who may
superimpose his basic judgment” must seem, to some readers, to be strangely at
variance with “the scientific approach to management (which) is replacing
management by Indian-medicine-man methods.”

One is repeatedly warned against the evils of having on one’s staff a
“mathematician or statistician who is interested in mathematics only for
mathematics’ sake.” Surely the advice is gratuitous, but one can fairly hear
the mind of a certain type of reader clicking out an ugly conclusion. At the
same time “It is also advisable to have someone on the staff qualified to
handle the more mathematical aspects of the latest statistical techniques.”

The numerous examples might well have been greatly reduced in number. 
The consequent demands of brevity beget obscurity, oftener than not. 

The most serious criticisms are two in number: Is the type of appeal effec- 
tive? and Is statistics put in a proper light? 

With respect to the first, there is a deplorable implication that the proper 
way to sell the idea to the management is by cajolery and flattery, although 
there are probably some who will buy, and that a smattering of ignorance is 
all one needs to run a quality control installation. In addition, it is a fact of
experience that managements do not regard promised savings as a
commendation of their own past performances and are more attracted by promises of
production and freedom from trouble.

With respect to the second, the executive lingers indefinitely, lacking sharp 
distinctions: (1) between statistics (singular) and statistics (plural); and (2) 
between statistics as a technique applied to a type of phenomenon and statis- 
tics as a technique applied to masses of numerical information. The complete 
and ill-founded assurance which most executives feel about their knowledge 
of the second element in each of these distinctions will certainly becloud 
their perception of the first. 

One is conscious of an intensification of the need for a short, clear and pre- 
cise work on statistics for executives but the talent which produces such things 
is rare. One cannot feel that the advice given in this book is either as good or 
as articulate as “Hire a good statistician and relax.” 


Paul S. Olmstead, Bell Telephone Laboratories


The wide variety of applications of statistical quality control discussed
in this book may be of interest to some statisticians. The book also con- 
tains a number of charts and photographs that illustrate how particular 
features of SQC may be presented convincingly to management. 

Unfortunately, the book gives ample evidence that the author is confused
about the present status of SQC. This is in part apparent from the titles of
the five sections in which the book is divided: I How to make the most of the
significant; II Why executives are interested in statistical probability; III
Brass hat facts about statistical control; IV “But my business is different”;
V Topside responsibility and participation in a control program. The author
seems afraid to use the term, Statistical Quality Control, that has been
accepted so generally by management. Instead, he makes an unconvincing
attempt to bring in a new term, Probability Control. This has no real
meaning for an executive whose primary interest is the best quality for the least
cost.

This book is not recommended reading either for the busy executive or for
the beginner in SQC. 


Factors Affecting the Demand for Consumer Installment Credit. Avram Kissel- 
goff. New York: National Bureau of Economic Research. 


George E. O'Rourke


This paper should command the attention of the professional
economist and statistician, both because it deals with a timely subject, 
and because it is an example of the application of the econometric method 
to a specific problem. We would suggest that its methodological significance 
outweighs the importance of the conclusions drawn. 

The statistical analysis of the demand for instalment credit casts into 
sharp focus the difficulties which beset one who attempts to utilize a method
which combines both the theoretical and the statistical approach in treating
economic data. The author made use of two statistical techniques in
estimating the parameters of the structural equation which determines the
demand for instalment credit. The first of these considers the equation as a
part of a general system, and estimates the parameters from reduced form
equations. This method leaves the estimates free of bias, but complicates the
computational difficulties, restricts the number of variables which can be 
handled, and raises the identification problem. As an alternative, Kisselgoff 
estimated the parameters by the use of multiple correlation, and ignored the 
fact that the equation for the determination of instalment credit is a part of 
a much larger and more complex system. 

The author feels that the bias introduced by the use of the single equation 
approach may not be significant for the type of study he is undertaking
(p. 29). This reviewer is inclined to agree with him. While the bias is there, it is
certainly insignificant when compared to the inexactitudes caused by the 
crudities of the data, the exclusion of a large number of important factors 
because of the limited number of observations, and the undeniable fact that 
the structure itself changes over time, and in a way which can only roughly 
be accounted for in a trend term. Kisselgoff’s restriction of the period of 
study to the pre-war era is brought about by a recognition of these short- 
comings. 

The need to rely in large part on the historical and institutional approach 
in economic investigations cannot be eliminated by the wholesale adoption 
of the econometric approach. The author gives implicit recognition to this 
fact when he refers to the stimulating effect of the veterans’ bonus on instal- 
ment credit demand, and the restrictive influences of the introduction of 
Regulation W (p. 42). 

One could perhaps suggest that the study should have included other ex- 
planatory variables, and should have been extended into the post-war period. 
However, these shortcomings, if such they be, may be attributed to the un- 
availability of data, and to the amount of personal judgment involved in 
selecting the variables to be considered as relevant. 

The results of Kisselgoff's analysis, although not startling, are at least 
reassuring to those who on a priori grounds would have indicated a relevance 
for those factors which he finds significant. The relative importance of
current income as a determinant of the demand for instalment credit is not
surprising, particularly when we consider the high income elasticity of consumer
durables. The high negative elasticity of demand with respect to the size of 
the required monthly payment is perhaps more revealing, and might prove 
of some interest to those who are responsible for monetary policy. The least
satisfactory feature of Kisselgoff's work is his method of accounting for the 
liquid asset effect in a number of models through the level of income lagged 
one year. This reviewer would prefer to see the liquidity element handled 
separately, and more explicitly. 


However, as it stands this paper is well worth the time required for a
careful reading, and should serve to point out what can be accomplished, and 
what cannot be accomplished in this difficult field. 


Agricultural Policy of the United States. Harold G. Halcrow. New York: Prentice- 
Hall Inc., 1953. Pp. vi, 458. 


Ivan M. Lee, University of California (Berkeley)


The material in this book is presented in three parts. Part I is given over
to a discussion of what the author calls the agricultural setting. Population
trends and future prospects, trends and relationships in selected
subaggregates of agricultural production, and the behavior of aggregate agricultural
income over time are summarized briefly in this part. Most of the remainder
of Part I is devoted to a diagrammatic and elementary numerically illustrated
discussion of selected economic concepts such as supply, demand, elasticity,
and costs. Part II is very brief, containing the author’s version of a useful
classification of the objectives of agricultural policy under the headings:
(1) increasing efficiency, (2) raising and stabilizing farm income, and (3)
improving social welfare. Part III occupies about one-half of the book. In this
part a wide range of government legislation affecting farmers is discussed 
under appropriately chosen chapter headings. 

In a review of a book which is offered as a textbook in agricultural policy, 
it would seem appropriate to pay some attention to the question of what
constitutes the field of agricultural policy. The author, in his opening remarks
in Part III (p. 208), suggests lines along which agricultural policy as a field
of study might, in the opinion of this reviewer, be fruitfully developed: 
“... The student of policy must become a student of economics, sociology, 
and political science, as well as several other subjects if he wishes to obtain a 
broad understanding of the field.” Having recognized the broader aspects of 
the field, the author chooses to restrict his analysis to the much narrower 
point of view of economics. He states (p. 208): “Our emphasis is on
economics. Our problem is to recognize the economic and political forces at
work and to bring economic analysis to bear on the problems under
discussion. . . . We cannot consider in one book the implications for policy of all
various disciplines such as philosophy, political science, and sociology. . . .
We shall talk about political interests and pressure groups. But we 
shall place our major emphasis on the economic analysis of the programs that 
are formulated to carry out the objectives of policy.” The author’s develop- 
ment is in the main consistent with this stated intention. The tools outlined 
in Part I are selected from those to which the beginning student is subjected 
in an elementary course in economic principles. The objectives of policy in 
Part II are in the main phrased in language which facilitates discussion in 
terms of economic logic. Finally, the analysis of programs in Part III is de- 
veloped primarily along economic lines. 

The author’s interest in narrowing his subject to more manageable
proportions is understandable. On the other hand, the main support for the
recognition of agricultural policy as a separate field of study would seem to come
from the desirability of bringing various ideas and techniques from several
related fields to bear on the subject under analysis. A textbook in the field
would appear a most appropriate place to develop this kind of an integrated 
approach. 

Considered from the narrower viewpoint of economic analysis, several
remarks seem pertinent. First, the level of analysis is quite elementary. In the
preface the author indicates that the book is designed: (1) to serve readers
with little previous exposure to economic theory, and (2) to serve as a basis
for the development of more advanced courses in policy. With respect to the
first objective, this reviewer is inclined to question whether an economic
analysis of agricultural policy can be effectively handled at this elementary
level. A very minimum prerequisite of one course in economic principles
would seem essential. With this background several chapters included in Part
I in the present form could be omitted. With regard to the second objective,
one cannot escape the conclusion that this text would need to be heavily
supplemented in an advanced course.

A second point concerns the absence of sufficient recognition of some of the
limitations of the policy researcher’s analytical tools. The elementary stu- 
dent in particular upon reading this book is likely to carry away the impression 
that the analytical tools are a good deal sharper than is in fact the case. This 
applies from both the economic and statistical points of view, but attention 
here is given to the latter. Economic concepts are quantitative concepts. 
A significant element in the analysis of various agricultural programs in- 
volves the estimation of relevant quantitative economic relations. The statis- 
tical theory of estimation of economic relations is admittedly rather involved 
and it would seem inappropriate to suggest that an attempt should have 
been made to treat it systematically in an elementary text in agricultural 
policy. On the other hand, the presentation of a number of estimates of co- 
efficients of demand elasticity as is done in Chapter 6 of this book with no 
caution regarding their tentative character seems equally inappropriate. If
such material is to be presented at all in an elementary book of this nature,
there would seem also to be some obligation to include an elementary
exposition of the relevant statistical problems of estimation. A defendable
alternative in the present case would have been to omit this material since the book
is not in the main quantitatively oriented. At the more advanced level a 
strong case can be made for an integrated treatment of econometric method, 
including the more recent development, in a textbook in agricultural policy. 

Another respect in which statistical methodology deserves some attention
is in connection with errors in the basic data commonly used in quantitative
research in agricultural policy. Data on prices, production, employment,
etc., used extensively by policy researchers, and appearing mainly in chart
form in the present book, are estimates based on methods which leave a
cloud of uncertainty regarding the errors of estimation. Those responsible
for these estimates recognize their fallibility although no measures of error
are provided as a guide to the user. The methods of estimation employed
depart in important respects from those dictated by sound statistical theory.
It is not suggested here that the writer of a textbook in agricultural policy
should be charged with the responsibility of developing measures of error. He 
is, however, under some obligation to recognize the presence of errors as an 
element in even the simplest type of quantitative analysis of the agricultural 
setting and of various agricultural programs. A common quantitative
technique in agricultural policy involves, for example, comparisons of certain
aggregates or averages in different segments of the economy or different areas 
within agriculture. Such comparisons might serve quite appropriately to 
suggest frictions and maladjustments in the functioning of the system.
Incomes per worker in agricultural and nonagricultural employment may serve
as an example of a commonly employed comparison. Both income and
employment estimates are subject to statistical errors of estimation. In addition,
particularly in connection with agricultural employment, conceptual or defi- 
nitional differences account for substantial discrepancies in current estimates 
available (BAE and Bureau of the Census). When various sources of error 
are taken into account, one wonders just how substantial a difference must 
be before it takes on genuine quantitative significance. In the case of
incomes per worker significant differences may well remain after allowance
for statistical and conceptual errors. Other comparisons could be cited
where this may or may not be the case. The point is raised not to question 
the aggregative comparative technique as a device for suggestive analysis 
but rather to suggest that in the development of agricultural policy as a field 
of study, proper attention to the statistical point of view would seem a con- 
structive innovation. 
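
The question of how large a difference must be can be made concrete with a minimal sketch. It assumes that standard errors were available for both estimates, which, as noted above, they are not, and that the two sampling errors are independent. The figures and the function name are hypothetical, and conceptual or definitional discrepancies are not captured.

```python
import math

def difference_in_standard_errors(est_a, se_a, est_b, se_b):
    """Return (a - b) divided by the standard error of the difference,
    assuming the two estimates have independent sampling errors."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (est_a - est_b) / se_diff

# Hypothetical incomes per worker and standard errors (illustrative only,
# not taken from the book under review or from any published series).
ratio = difference_in_standard_errors(est_a=3000.0, se_a=120.0,
                                      est_b=2000.0, se_b=160.0)
print(round(ratio, 1))
```

A ratio of several standard errors suggests a difference that would survive sampling error alone; a ratio near one or two would not.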


Causes of Decline in the World’s Cotton Textile Trade. Osaka, Japan: Institute 
for Economic Research, Toyo Spinning Co., Ltd., 1952. Pp. 48. 


Karl A. Fox, Bureau of Agricultural Economics


This brief study was prepared for the All Japan Cotton Spinner's
Association in connection with discussions at the International Cotton
Conference held in 1952. The foreword states that “The manuscript in Japanese has
had the examination and approval of the members of the Japanese
delegation to the Conference.”

This is primarily an economic analysis. The statistical methods used are 
simple, and the terminology in some passages is more pretentious than
descriptive. For example, a table dividing the total volume of world trade in
cotton textiles into imports and exports by each of two major groups of
countries is described as an “input-and-output relation table.” A tabular
comparison of changes in rayon and cotton consumption is also said to
employ “the input-and-output analysis.” The only resemblance to
input-output analysis is that the sums of row and column totals in the tables are
both equal to the element in the lower right hand corner. 
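
The property just described holds for any two-way table with margins, as a minimal sketch shows. The trade volumes below are hypothetical and are not figures from the study.

```python
# Hypothetical trade volumes between two groups of countries
# (rows: exporting group, columns: importing group). Illustrative only.
trade = [
    [40, 25],   # exports of group A to group A and to group B
    [30, 15],   # exports of group B to group A and to group B
]

row_totals = [sum(row) for row in trade]          # total exports by group
col_totals = [sum(col) for col in zip(*trade)]    # total imports by group
grand_total = sum(row_totals)                     # lower right-hand corner

# Row totals and column totals both add up to the corner element. This is
# true of every table with margins and does not by itself make the table
# an input-output table in the technical sense.
assert sum(row_totals) == sum(col_totals) == grand_total
```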

In some cases it is not clear what countries and time periods are used in a 
given analysis, nor even what method of analysis is used. On page 31,
correlation coefficients, income elasticities, and price elasticities of demand for
cotton textiles are reported with no indication of the number of observations
underlying each, or of their standard errors or levels of significance. Price
elasticities are reported for Japan for the periods 1920-24, 1925-31 and
1920-31 as a whole. The price elasticity of cotton textile consumption in 1920-24
is given as -0.58 and in 1925-31 as -0.69. The income elasticity in
1920-24 is reported as 0.43 while that in 1925-31 is reported as 1.27. If these
results were based (as seems evident) upon multiple regression analyses, the
analysis for the first period left two degrees of freedom while that for the 
second period left four. The apparent drastic change in income elasticities 
between the two periods is probably not significant. Yet on page 33, the fol- 
lowing inferences are drawn from these analyses: “In the first half of the peri- 
od in question when income was generally low, price was the more important 
factor, whereas in the latter half of the period when the average income has 
increased, income elasticity was greater than price elasticity, from which we 
learn that income rather than price was the more dominant factor.” 
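
The degrees-of-freedom count mentioned above can be reproduced with a short sketch. It assumes annual observations and a regression of consumption on price and income with a constant term; the study itself does not state the specification, and the function name is hypothetical. The critical values are the usual two-sided 5 per cent points of Student's t from standard tables.

```python
def residual_df(n_obs, n_regressors, intercept=True):
    """Degrees of freedom left after fitting a linear regression."""
    return n_obs - n_regressors - (1 if intercept else 0)

# Annual data, two regressors (price and income) plus a constant.
# Both the annual frequency and the constant term are assumptions.
first_period = residual_df(n_obs=1924 - 1920 + 1, n_regressors=2)   # 5 years
second_period = residual_df(n_obs=1931 - 1925 + 1, n_regressors=2)  # 7 years

# Two-sided 5 per cent critical values of Student's t, from standard tables.
T_CRIT_5PCT = {2: 4.303, 4: 2.776}

for label, df in (("1920-24", first_period), ("1925-31", second_period)):
    print(label, "df =", df, "critical |t| =", T_CRIT_5PCT[df])
```

With so few degrees of freedom a coefficient must exceed roughly three to four times its standard error to be judged significant, which is why the apparent change in income elasticity carries little weight.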

The elasticity coefficients, of course, do not show which factor was “more 
dominant” during the period in question; furthermore, we would ordinarily 
expect the income elasticity of demand for textiles (in terms of yards of 
cloth consumed) to be smaller at high than at low income levels. An analysis 
of cotton textile consumption in Indonesia during the years 1931-38 is also 
reported and yields a price elasticity of -0.63 and an income elasticity of
+0.52. Again no standard errors are shown. On page 34 the following infer- 
ence is drawn: “As is readily discernible from these estimates, demand in 
the under-developed countries is more apt to be affected by price fluctuations 
than by changes in national income.” 

The study as a whole gives a specious appearance of carefulness and
objectivity. It is essentially an economic brief pleading a special cause, and
advancing proposals which would be of primary benefit to the cotton textile
industry of Japan. While it contains more statistics, and perhaps a more
reasonable interpretation of them, than is common in economic briefs
prepared in advocacy, upon closer examination the statistical analysis is found
to be extremely weak and the inferences drawn from it largely unwarranted.
It would be interesting to see what a competent and objective analyst, or a
team consisting of a foreign trade specialist and a statistician, could do with
the same basic material.


A Short Scale for Measuring Farm Family Level of Living: A Modification of
Sewell’s Socio-Economic Scale. John C. Belcher and Emmit F. Sharp. Stillwater,
Oklahoma: Oklahoma Agricultural Experiment Station, 1952. Pp. 22.


Fred L. Strodtbeck, University of Chicago


Those concerned with the determination of the socio-economic status
of rural, or non-rural, families will wish to examine this revision of
Sewell's 1940 scale.¹ The authors have meticulously correlated the presence
and absence of 29 characteristics in a sample of 825 open-country Oklahoma
families. The characteristics examined had previously been found to be the
most consistently discriminative from the set of 123 items used by Sewell.
The distinctive contribution of the present study is a factor analysis which
reveals that Sewell’s list contained an economic cluster and a cluster of items
pertaining to religious participation. The writers accordingly select the items
from the economic cluster, rework the weights of the alternatives, and then
present a brief, easily administered, “level of living” scale. The ten items of
the short scale treat construction of house, plumbing, lighting and
refrigeration facilities and similar matters.

¹ Sewell, William H., The Construction and Standardisation of a Scale for the Measurement of the
Socio-Economic Status of Oklahoma Farm Families, Stillwater, Oklahoma, April, 1940.

From a more general point of view, this study represents a reversal of 
current trends in the sociological analysis of “class” phenomena. Most recent 
efforts have been directed toward creating an easily administered but factori- 
ally complex scale which would maximally reproduce judgments of par- 
ticipants in the community or similar criteria. From the standpoint of broad 
demographic investigation (such as the origins of high level talent or the inci- 
dence of schizophrenia) we are very greatly in need of a status classification 
for agricultural populations which would articulate with those we use in 
urban analysis. Insofar as the present study throws into sharp relief the
possibility that factorially pure scales with unambiguous items may neces- 
sarily be highly specific in the cultural traits they involve, we are forewarned 
that we may be led further from some of the engineering objectives we seek 
to attain by socio-economic indices if we insist on single factor sub-scales at 
this time. 

Within the limited scope the authors worked, it is to be regretted that no
systematic investigation was made of the minimum number of items which 
would essentially reproduce their scale. A practical administrator might also 
wish to know the items which are most or least sensitive to level of living 
fluctuations of the type associated with droughts or new farm parity pro- 
grams. It is my feeling that the items they have used would lag well behind 
decreases in income.


The Labor Force in California: A Study of Characteristics in Labor Force, Em- 
ployment and Occupations in California, 1900-1950. Davis McEntire. Berkeley: 
University of California Press, 1952. Pp. x, 101. $2.50. Paper. 


Gladys L. Palmer, University of Pennsylvania


The Institute of Industrial Relations of the Berkeley branch of the
University of California has broadened the base of its studies of wages and
collective bargaining problems in a recent study of changes in the labor 
force of the state of California from 1900 to 1950. Professor McEntire’s 
analysis of a half century of changes in the labor force and structure of 
employment in California provides a background for the understanding of 
many labor market problems in “the most dynamic state segment of the 
national labor force.”

The first chapters discuss changes in population and labor-force
participation rates and their effects on the composition and size of the California labor
force. Later chapters outline long-term trends in the occupational and 
industrial distribution of employment and the impact of the war and defense 
production programs. Racial differentials in labor-force rates and employ- 
ment attachments are also considered. 

These data provide a skeleton structure for research in many labor
market problems which can most appropriately be studied in California. For
example, a major force in the growth of California's population and labor
force from 1900 to 1950 has been migration. Although the extent of immigra- 
tion is known to have varied from decade to decade, it would be valuable to 
know how much its character has changed. The extent to which or the rapid- 
ity with which migrants take on the labor force characteristics of workers in 
places of destination as against places of origin might be more readily studied 
here than elsewhere, because of a relatively large volume of net migration.
Other hypotheses about the propensity to migration on the part of workers
in different occupational groups or the influence of unemployment or of
wage differentials need testing. 

A visitor to California is impressed by the combinations and permutations 
by which California families earn a living. They appear to be more varied 
in this state than in others. If true, this variety may stem from the seasonal 
character of some industries and consequent irregularity of employment or 
indeterminateness in the trends of the state’s economy, as well as other 
forces. But it is not yet clear whether California is on its way to becoming 
an industrialized state or whether its future manufacturing development may 
be limited. A study now in process may answer the latter question but it is 
hoped that the Institute may see its way clear to include some of the prob- 
lems noted in its future research program. 


The Pattern of Age at Marriage in the United States, Vols. I and II. Thomas P.
Monahan. Philadelphia: Stephenson-Brothers, 1951. Pp. vi, 451. $4.00.


Paul H. Jacobson, Metropolitan Life Insurance Company


This statistical study (the author's doctoral dissertation in sociology at the
University of Pennsylvania) is the first in many years which is devoted
exclusively to marriage in the United States. The principal objective was to
determine the long-term trend in age at marriage, and at the same time to
throw some light on correlated factors such as occupation, education,
nationality, race, residence, and the law. Toward this end, Dr. Monahan has
drawn liberally on hundreds of sources, including publications of the Bureau 
of the Census, as well as contributing his own sample tabulations of New 
Jersey marriage records dating back to 1848. 

The author is extremely critical of past studies on age at marriage and 
believes that no conclusions on the long-term trend can be reached from 
available data. To support this thesis, he presents what appears to be
internally inconsistent evidence. In this reviewer's opinion, however, the
“inconsistency” is due largely to variations in the degree to which the data are
refined in different parts of the book. Thus, in the latter part of the first volume,
when allowance is properly made for the changing age composition of our 
population, he does find indications of a decline in age at marriage since the 
turn of the century. However, these findings, although in conformity with 
prevailing opinion, cannot be accepted as conclusive evidence of the long- 
term trend, since the data are limited to widely separated periods of time 
for only a few individual states—evidence which is hardly representative of 
the secular trend for the country. 

The author would have done better to approximate the annual age specific 
marriage rates for the country by assembling all data available for the 
period studied. With such estimates it would have been possible to trace the 
total experience of a generation and then to draw conclusions regarding the 
trend in age at marriage. Dr. Monahan dismisses as inadequate the “census” 
method of determining the median age at first marriage (derived from popu- 
lation statistics for all persons ever married), yet the “census” method tends 
to approximate what the generation method would have shown for the period 
since 1890. 

With the population data for the proportions ever married arranged on a 
generation basis, it would be hard to refute the hypothesis, which the author 
refuses to accept, that the age at marriage rose during and immediately 
after the Civil War and that it did not begin to decline again until just before 
the turn of the century. In other words, the trend appears to have been re- 
versed when persons born around 1875 reached the usual age for marriage. 
With proper evaluation and organization of his material, Dr. Monahan would 
have been in a better position to confirm or contradict this hypothesis, long 
current among many researchers. 

No doubt, the literature is “honeycombed with misstatements of fact 
and dubious results,” but in this reviewer’s opinion the author has not done 
much to clarify the situation. The reader will not find the trend in age at 
marriage, nor even of total marriages, in this book. The absence of an index 
and a very sketchy table of contents also detract from its value. Neverthe- 
less, if the book stimulates action to remedy the deficiencies of available 
data on this important subject, Dr. Monahan will have made a lasting
contribution. His bibliography, covering 100 pages, is the most comprehensive
published to date, and should prove of interest and value to other investigators
in this field. 


Design for a Brain. W. Ross Ashby. New York: John Wiley and Sons, Inc., 1952. 
Pp. ix, 260. $6.00. 
A. S. HOUSEHOLDER, Oak Ridge National Laboratory 
“I hope to show that a system can be both mechanistic in nature and yet
produce behavior that is adaptive.” This is the goal the author sets for
himself on page 1. The principle upon which such a system operates, however,
is “a principle hitherto little used in machines.” The system is called
“multistable,” and it consists of many subsystems called “ultrastable.”

Before developing the notions of ultrastability and multistability, the 
author discusses the meaning of stability and equilibrium; defines adaptive 
behavior as that which “maintains the essential variables within physiologi- 
cal limits”; distinguishes variables (which define the state of the system) 
from parameters (which describe the situation in which it is placed); and 
introduces a number of special terms. In particular an absolute system is 
defined, and the definition is shown to be equivalent to the condition that the 
behavior of the system is governed by a system of ordinary differential 
equations in which time does not appear explicitly. Also functions are classi- 
fied as step-functions, part-functions (with finite intervals of constancy), 
full functions (continuous and having no interval of constancy), and null- 
functions (everywhere constant). 

Now an ultrastable system is defined as “one that is absolute and contains 
step-functions in a sufficiently large number for us to be able to ignore the
finiteness of the number.” Consider the organism, for the moment, as an 
ultrastable system, subject to some set of external conditions. The system 
may be in a stable equilibrium with its variables undergoing no change; or 
the variables may be changing but within limits; or, finally, at least one 
of the variables may be approaching a level that would be injurious to the 
organism. In an ultrastable system one may expect that before that level 
is reached the system will encounter a “critical state,” at which a step-func- 
tion changes value. It is then as though a new set of differential equations 
takes over, the kinetic properties of the system undergo a sudden change, 
and we have, in effect, a different system upon our hands. In the system thus 
altered, it may happen that now the system approaches a steady state with 
the variables confined between physiological limits. But if not, then a new 
critical point may be reached, at which there occurs another step-function 
change. If eventually, after one or more such changes, the system reaches a 
steady state before a variable actually reaches a level the organism is unable 
to tolerate, then the organism has successfully adapted to its present environ- 
ment. If not it succumbs, or undergoes injury in some degree. 

Though the author speaks of critical points, perhaps it would be better
to speak of critical regions. The topology is nowhere described explicitly,
but the diagrams seem to indicate a simply connected finite region, no point 
of which is critical, but outside which every point is critical. One infers also 
that the critical region consists of critical subregions, possibly overlapping, 
each subregion being critical for a particular step-function. 

If there are n step-functions, all independent, in the sense that by knowing
the values of n - 1 of these we cannot infer the value of the nth, then there are
2^n possibilities even if each step-function has only two possible values. If
it is a matter of pure chance which and how many step-functions change values
at any time, and if stability is achieved in only one or a small number of
these possible cases, then the animal would probably be dead before it hits
upon a favorable combination when it is subjected to conditions that evoke 
such changes. Partly to evade this difficulty, partly to account for the fact 
that learning and adaptation generally progress by degrees, the author intro- 
duces the notion of multistable systems, consisting of many ultrastable 
systems. Each ultrastable system contains only a small number of
part-functions, and it can seek its own equilibrium, in some measure independently
of the other ultrastable systems which make up the organism as a whole.
The independence is achieved by linking the ultrastable systems with
part-functions. Thus if subsystems S1 and S2 have only the variable x in common,
and if x is a part-function, then the systems are essentially independent when
x is constant.
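
The combinatorial difficulty described above can be made concrete with a small calculation. The sketch below is an illustration added here, not part of the author's argument. It assumes that every critical state re-draws all n two-valued step-functions uniformly and independently, and that exactly k of the 2^n configurations are stable; under those assumptions the number of trials needed follows a geometric law. The function names and the sample values of n, k, and m are hypothetical.

```python
def success_probability(n: int, k: int) -> float:
    """Chance that one uniform random draw of n two-valued step-functions is stable,
    assuming exactly k of the 2**n configurations are stable."""
    return k / 2 ** n


def expected_trials(n: int, k: int) -> float:
    """Mean number of draws until the first stable configuration (geometric law)."""
    return 1.0 / success_probability(n, k)


def prob_adapted_within(n: int, k: int, m: int) -> float:
    """Probability that at least one of m independent draws is stable."""
    return 1.0 - (1.0 - success_probability(n, k)) ** m


if __name__ == "__main__":
    # Hypothetical sizes: one stable configuration, 100 trials allowed.
    for n in (5, 10, 20):
        print(n, expected_trials(n, 1), prob_adapted_within(n, 1, 100))
```

Under the same assumptions, a single system of 20 step-functions with one stable configuration needs about a million trials on average, while a subsystem of 5 step-functions needs only 32. This is the kind of saving that motivates breaking the organism into many nearly independent ultrastable subsystems.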

The argument is verbal and qualitative throughout. An appendix serves to 
give mathematical clarification to some of the notions and to develop a few 
auxiliary theorems, without purporting to constitute a formal demonstration 
of the theses. As a verbal development, it is lucid and persuasive. Numerous 
quotations from the literature in psychology, physiology, protozoology, 
etc. suggest the presence of step-functions and part-functions, and of ultra- 
stability, and otherwise illustrate the author’s argument. 

In principle it should be easy to construct an ultrastable system, and possibly
also a multistable system “in the metal,” and the endeavor is to be
recommended to those interested in robotology. The author, in fact, describes a
system of the former type actually in existence, and states that a multistable 
system is under construction. These should be interesting to observe and 
might indeed exhibit many of the characteristics of living beings. 

The author, of course, promises only to show that a mechanical system
can exhibit adaptive behavior. The basic question is, therefore, a very
difficult probabilistic question. Suppose one has constructed, in metal or on paper,
an ultrastable or a multistable system. Consider, in probabilistic terms, the 
situations to which it might have to adapt or succumb. We can then ask what 
chance it has of surviving, or rather, what is its life expectancy? Even though
it may be capable of adapting to any given situation, in the sense that there 
exists an appropriate set of values for its several step functions, what are 
the chances that one such set will be “discovered” before one of the variables 
exceeds physiological limits? 

This is the report of a committee appointed by the Commission on
Statistical Standards of the American Statistical Association to
review the statistical methods used in Sexual Behavior in the Human
Male. We shall refer both to the book and to its authors (Kinsey,
Pomeroy and Martin) as KPM. The committee wishes to emphasize
that this report is confined to statistical methodology, and does not
concern itself with the appropriateness or the limitations of orgasm
as a measure of sexual behavior. The treatment of specific problems
has necessitated an examination of some of the statistical and
methodological problems of such studies, and the organization of frames of
reference in which the statistical methods can be discussed. The
committee hopes that both detailed and general considerations will be of
service to Dr. Alfred C. Kinsey and his co-workers; to the National
Research Council's Committee for Research on Problems of Sex, who 
requested the appointment of this committee; and to others facing 
similar statistical or methodological problems. 

We have endeavored to write this report in a way that would
minimize the possibility of misunderstanding. To do this, it is necessary to
deal with many detailed aspects of the work, one at a time. By judicious
selection of topics and attitudes, it would have been possible to write
two factually correct reports, one of which would leave the impression
with the reader that KPM's work was of the highest quality, the other
that the work was of poor quality and that the major issues were
evaded. We have not written either of these extreme reports.

* This article consists of the main text, but not the appendices, of the report of a committee
appointed in 1950 by S. S. Wilks as President of the American Statistical Association, to review the
statistical methods used by Alfred C. Kinsey, Wardell B. Pomeroy, and Clyde E. Martin in their Sexual
Behavior in the Human Male (Philadelphia, W. B. Saunders Co., 1948). For further details on the
appointment of the committee and its charge, see Section 1, p. 676 below. For an outline of the appendices,
as well as of this paper, see Section 3, pp. 678-81. Appendix G, “Principles of Sampling,” will appear
as an article in the March issue of this JOURNAL. The full report, including both the text given here
and the appendices, will be published as a monograph by the American Statistical Association in 1954.

Even within the present report, a reader who is trying only to
support his own opinions could select sections and topics to buttress either
view. In the details of this report the reader will find numerous
problems that we feel KPM handled admirably. If he pays attention only
to these, he would find support for the opinion that the work is nearly
impeccable and that the conclusions must be substantially correct.
There are other problems which we believe KPM failed to handle ade- 
quately, in some cases because they did not devote the necessary skill 
and resources to the problems, in other cases because no solutions for 
the problems exist at present. The reader who concentrates only on the 
parts of our report in which such problems are discussed would find 
support for the opinion that KPM's work is of poor quality. 

Our own opinion is that KPM are engaged in a complex program of 
research involving many problems of measurement and sampling, for 
some of which there appear at the present to be no satisfactory solu- 
tions. While much remains to be done, our overall impression of their 
work to date is favorable. 

Many details are discussed in the body and appendices of this report. 
The main conclusions are as follows:

1. The statistical and methodological aspects of KPM's work are
outstanding in comparison with other leading sex studies. In a
comparison with nine other leading sex studies (four supported in part
by the same NRC Committee) KPM were superior to all others in
the systematic coverage of their material, in the number of items which
they covered, in the composition of their sample as regards its age,
educational, religious, rural-urban, occupational, and geographic
representation, in the number and variety of methodological checks which
they employed, and in their statistical analyses. So far as we can judge 
from our present knowledge, or from the critical evaluations of a num- 
ber of other qualified specialists, their interviewing was of the best. 

2. KPM’s interpretations were based in part on tabulated and
statistically analyzed data, and in part on data and experience which were
not presented because of their nature or because of the limitations of 
space. Some interpretations appear not to have been based on either 
of these. We feel that unsubstantiated assertions are not in themselves 
inappropriate in a scientific study. The accumulated insight of an
experienced worker frequently merits recording when no documenta- 
tion can be given. However, KPM should have indicated which of their 
statements were undocumented or undocumentable and should have 
been more cautious in boldly drawing highly precise conclusions from 
their limited sample. 

3. Many of KPM’s findings are subject to question because of a
possible bias in the constitution of the sample. This is not a criticism 
of their work (although it is a criticism of some of their interpretations). 
No previous sex study of a broad human population known to us, medi- 
cal, psychiatric, psychological, or sociological, has been able to avoid 
this difficulty, and we believe that KPM could not have avoided the 
use of a nonprobability sample at the start of their work. Something 
may now perhaps be done to study and reduce this possible bias, by a 
probability sampling program. 

In our opinion, no sex study of a broad human population can expect 
to present incidence data for reported behavior that are known to be 
correct to within a few percentage points. Even with the best available 
sampling techniques, there will be a certain percentage of the popula- 
tion who refuse to give histories. If the percentage of refusals is 10 
per cent or more, then however large the sample, there are no statistical 
principles which guarantee that the results are correct to within 2 or 
3 per cent. The results may actually be correct to within 2 or 3 per cent, 
but any claim that this is true must be based on the undocumented 
opinion that the behavior of those who refuse to be interviewed is not 
very different from that of those who are interviewed. These comments,
which are not a criticism of KPM’s research, emphasize the difficulty 
of answering the question: “How accurate are the results?”, which is 
naturally of great interest to any user of the results of a sex study. 
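
The arithmetic behind this point can be sketched as follows. The example is an illustration added here, not a calculation taken from the report. It assumes a sample so large that sampling error is negligible, and it makes no assumption at all about the people who refuse. The function name and the figures of 40 per cent and 10 per cent are hypothetical.

```python
def incidence_bounds(p_respondents: float, refusal_rate: float) -> tuple[float, float]:
    """Range of population incidence consistent with the respondents' incidence
    when nothing is known about those who refused."""
    if not (0.0 <= p_respondents <= 1.0 and 0.0 <= refusal_rate <= 1.0):
        raise ValueError("both arguments must be proportions in [0, 1]")
    answered = (1.0 - refusal_rate) * p_respondents
    lower = answered                 # extreme case: no refuser has the behavior
    upper = answered + refusal_rate  # extreme case: every refuser has the behavior
    return lower, upper


if __name__ == "__main__":
    # Hypothetical figures: 40 per cent incidence among respondents, 10 per cent refusals.
    low, high = incidence_bounds(0.40, 0.10)
    print(f"population incidence lies between {low:.2f} and {high:.2f}")
```

The width of the interval equals the refusal rate whatever the sample size, so with 10 per cent refusals the data alone cannot narrow the answer below a 10-point range. Any tighter claim has to rest on a judgment about how the refusers behave.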

4. Many of KPM's findings are subject to question because of possi- 
ble inaccuracies of memory and report, as are all studies of intimate 
human behavior among broad segments of the population. No one has 
proposed any way to remove the dangers of recall (involving both 
memory and report) and KPM were superior to the nine studies re- 
ferred to above in their attempts to control and measure these dangers. 
We have suggested still further expansions of their methodological 
checks.

Until new methods are found, we believe that no sex study of inci- 
dence or frequency in large human populations can hope to measure 
anything but reported behavior. It may be possible to obtain observed 
or recorded behavior for certain special groups, but no suggestions have 
been made by KPM, the critics, or this committee which would make
it feasible to study observed or recorded behavior for a large human
population. These remarks are intended as a comment on the present
status of research techniques in sex studies and not as a criticism of 
KPM’s work. 

5. KPM received only limited statistical help, in part because the
work was pursued during the War years when such expert help was 
difficult to find for non-military projects. In view of the limited statis- 
tical knowledge which was available to them, as made clear by the 
failure of their sample size experiment, KPM deserve much credit 
for the straight thinking which brought them safely by many pitfalls. 
Their need of adequate statistical assistance continues to be serious. 
Substantial assistance might come through the development of a 
statistical clinic at Indiana University, or through the addition of a 
statistical expert to KPM's own staff. Unfortunately the sort of assist- 
ance which might resolve some of their most complex problems would 
require understanding, background, and techniques that perhaps not 
more than twenty statisticians in the world possess. 

6. A probability sampling program should be seriously considered 
by KPM. The actual gains from an extensive program are limited, to 
an extent unknown at present, by refusal rates and indirectly by costs, 
particularly by the costs of maintaining the present quality of the
individual histories by KPM’s approach. A step-by-step program, starting
with a very small pilot study, is recommended. 

7. In addition to proposing a probability sampling program, we 
have made numerous suggestions in this report for the modification 
and strengthening of KPM’s present approach. The suggestions
include expanded methodological checks of their sampling program, a
further study of their refusal rate, some modification of their methods 
of analyses, further comparisons of reported vs. observed behavior, 
and stricter interpretations of their data. We have been informed by 
KPM that many of these improvements, including some expansion 
of their techniques for obtaining data, have already been incorporated 
in the volume dealing with sexual behavior in the human female. 


CHAPTER I. BACKGROUND AND ORGANIZATION 
1. Organization involved 


This committee, consisting of William G. Cochran, Chairman,
Frederick Mosteller, and John W. Tukey, was appointed by President
S. S. Wilks in September 1950 as a committee of the Commission on


STATISTICAL PROBLEMS OF THE KINSEY REPORT 677 


Statistical Standards of the American Statistical Association. This 
action was initiated by a request from the Committee for Research
on Problems of Sex of the National Research Council, as indicated by 
the following excerpt from a letter dated May 5, 1950, from Dr. George 
W. Corner, a member of the NRC Committee, to Dr. Isador Lubin, 
Chairman of the Commission on Statistical Standards of the American 
Statistical Association. 

"In accordance with our telephone conversation of yesterday, I am writing 
to state to you the desire of the Committee for Research in Problems of Sex, 
of the National Research Council, that the Commission on Standards of the 
American Statistical Association will provide counsel regarding the research 
methods of the Institute for Sex Research of Indiana University, led by 
Dr. Alfred C. Kinsey. 

"This Committee has been the major source of financial support of Dr. 
Kinsey's work, and at its annual meeting on April 27, 1950, again renewed 
the expression of its confidence in the importance and quality of the work 
by voting a very substantial grant for the next year.

"Recognizing however that there has been some questioning, in recently 
published articles, of the validity of the statistical analysis of the results 
of this investigation, the Committee, as well as Dr. Kinsey's group, is 
anxious to secure helpful evaluation and advice in order that the second 
volume of the report, now in preparation, may secure unquestioned ac- 
ceptance." 


Some correspondence ensued, in which Wilks indicated the willing- 
ness of the American Statistical Association to provide counsel as 
requested. 

Kinsey, in a letter to Wilks dated August 28, stated that 

“We should make it clear that we deeply appreciate the willingness of the
American Statistical Association to undertake such an examination of our
statistical methods, that we will give it full cooperation in having access
to all of our data as far as the peculiar confidential nature of our data will 
allow, and that we understand, of course, that the committee shall be free to 
publish its findings of whatever sort.” 


In the same letter, Kinsey also made a number of suggestions about 
the constitution and work of the committee, to the effect that the 
persons on the committee should be primarily statisticians with experi- 
ence in human population studies, that they should plan to review 
the statistical criticisms which have been published about the book on 
the male, and that they should compare methods used by Kinsey and 
his associates in their research with methods in other published research 
in similar fields. 

With respect to the research on the human female, Kinsey wrote as 
follows: 


678 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1953 


“It should, however, be made clear that all the data that will go into our
volume on Sexual Behavior in the Human Female are already gathered, that
the punch cards have already been set up and most of them punched, and 
that statistical work is proceeding on that volume now. While the recom- 
mendations of the committee may modify further work, it can affect this 
forthcoming volume only in the form in which the material is presented, the 
limitations of the conclusions, and the careful description of the limitations 
of our method and conclusions." 


2. Committee procedure 


Although no specific written directive was issued to the committee,
the letter quoted earlier from Corner to Lubin sets forth the task as- 
signed to the committee. In one respect the scope was deliberately 
reduced as compared with that envisaged in the letter. The committee 
decided not to undertake any examination of the researches and data 
relating to the human female, in order to avoid disruption of Kinsey’s 
proposed schedule of work. 

In October, 1950, the committee spent five days at the Institute for 
Sex Research of Indiana University, accompanied by Mr. Robert 
Osborn as assistant. Subsequent meetings of the committee were held 
at Chicago (December 1950), Princeton (January 1951), Cambridge 
(May 1951), Baltimore (July 1951) and Princeton (October 1951). 

In their review of previous studies of sexual behavior, the committee 
received major assistance from Dr. W. O. Jenkins, who prepared a 
series of reports which appear in Appendix B. Mr. A. Kimball Romney 
prepared a helpful index of the principal criticisms made of the
statistical methodology used in the book Sexual Behavior in the Human
Male.


3. Structure of this report as a whole 


KPM's program of research is a major undertaking, involving more
than ten years’ work. Any discussion of it which aims at thoroughness 
must itself be lengthy. In order to keep the main body of our report 
down to a reasonable length, we have relegated much of the documen- 
tation of our conclusions, and all detailed discussion, to the following 
series of appendices. 


A. Discussion of comments by selected technical reviewers.
B. Comparison with other studies.
C. Proposed further work.
D. Probability sampling considerations.
E. The interview and the office as we saw them.
F. Desirable accuracies.
G. Principles of sampling.

Appendix A contains our discussion of the statistical and quantitative
methodological content of six of the critical reviews which
appeared after the publication of the KPM book. These six were chosen
from among the large number of published reviews, because they 
concentrated their attention on the statistical aspects of the research. 
Appendix A also includes, where this seems appropriate, discussion of
some critical points which were not explicitly raised in the reviews in 
question. 

Appendix B, by W. O. Jenkins, contains a review of the statistical 
aspects of eight of the major previous sex studies which have been car- 
ried out in the United States. Also included are similar reviews of the 
KPM book and of one more recent study by J. E. Farris. The purpose 
of this appendix is to provide a basis for comparing the KPM study 
with the other studies as to comprehensiveness, sampling methods, 
interviewing methods and statistical analysis. 

Appendix C begins by outlining and commenting on suggestions for 
further work made by the reviewers. It explains the difficulty of
estimating the stability of results from a sampling procedure such as
KPM's, offers some possible methods for this estimation, and suggests 
how more appropriate variables for expressing sexual behavior might 
be developed, and how compound variables might be built on these. 
It then explores the problem of when to adjust, giving a simple numeri- 
cal procedure for making the decision, and concludes by summarizing 
the probability sampling suggestions derived from Appendix D. 

Appendix D discusses the problems of analysis and usefulness of 
probability sampling as a check on a nonprobability sample, particu- 
larly when refusal rates are considered; two possible types of probabil- 
ity samples and a probability sampling program which KPM might 
undertake; and the alternative of studying restricted populations.

Appendix E discusses the interview and the office as we saw them. 
Appendix F discusses what seems to be known about the accuracy 
needed in such work as KPM’s. Appendix G presents an account of 
the principles of sampling illustrated with general examples. 

Many of the problems faced by KPM occur in most types of soci- 
ological investigation. Some are likely to be encountered in almost any 
kind of scientific investigation. For this reason, we have thought it 
advisable to present certain of the methodological issues in rather 
general terms. 

The reader is asked to bear in mind that in general our conclusions 
are not documented in the main body of the report, but in the appen- 
dices to which references are given. 


4. Structure of the main body 

In preparing the main body, we have stressed easy reference and 
have kept related matters together at the expense of fluency of arrange- 
ment and lack of repetition. Thus our main conclusions in a form in- 
tended for the general reader take 3 pages in the digest above, while 
more detailed conclusions, expressed for a more technical audience, take 
3 pages in Chapter XI. A particular subject summarized there is also 
likely to be discussed once in Chapter II, where we try to point out
what KPM did, once again in one of Chapters IV to IX, where we 
assess KPM on an absolute scale, and yet again in Chapter X, where 
we compare KPM with previous workers in the field. This is repetitive, 
but we hope that it will permit ready reference and avoid treating
subjects out of context. 

After this introductory chapter, the remainder of the main body
falls into three parts:
(i) Chapters II and III. In the first of these, we describe, respectively,
what choices KPM had to make and what they chose. In Chapter III 
we outline some essential principles of sampling, which seem not to 
have been clearly enough formulated or widely enough understood. 
These chapters are introductory. 

(ii) Chapters IV to XI. In the first six of these, we try to compare
KPM's work with an absolute standard. The order chosen (interview, 
sample, methodological checks, analytical techniques, complex exam- 
ples, interpretation) is that in which the problems arise in an evolving 
study such as KPM's. Chapter X compares KPM with previous works 
on the basis of Appendix B, while Chapter XI summarizes the conclu- 
sions of this part. 

(iii) Chapter XII. This discusses briefly various suggested expendi- 
tures of further effort. 


CHAPTER II. MAJOR AREAS OF CHOICE
5. What sort of behavior? 


The purpose of Chapter II is to record in summary form the major 
choices made by KPM. 

Certainly the choice of orgasm as the central sort of sexual behavior 
for study was a major one, leading to consequences whose statistical 
aspects will be discussed in various places, but this choice is not a mat- 
ter of general quantitative methodology, and hence falls outside the 
scope of this committee’s task. 


6. Whose behavior? 


KPM had to choose the population to which this study should apply. 
This decision does not seem to have been made clearly. From the basis 
for the “U. S. Corrections” (p. 105) we should infer it to be “all U. S.
white males.” If it were the population to which the U. S. Corrected 
sample actually applies on the average (the sampled population, see 
Section 18), it would be a rather odd white male U. S. population.
It would have age groups, educational status, rural-urban background, 
marital status and all their combinations according to the 1940 census, 
but it would have more members in Indiana than in any other state, 
and it would have been selected to an unknown degree for willingness 
to volunteer histories of sexual behavior. We do not regard this
description of the sampled population as an automatic criticism, as some
critics do. We make it here as a factual statement, noting that the careful
and wise choice of the sampled population, although difficult, is a rela- 
tively free choice of the investigator. More discussion relevant to this 
point will be found in Chapter II-G (Appendix G). 
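
The gap between the target population and the sampled population can be illustrated with a short sketch of census weighting. The example is an illustration added here under a stated assumption: that a correction of this kind amounts to weighting each segment's sample result by that segment's share of the census population. The report does not spell out the procedure at this point, and the function name, segment labels, and figures below are all hypothetical.

```python
def corrected_incidence(segment_incidence: dict[str, float],
                        census_share: dict[str, float]) -> float:
    """Weight each segment's sample incidence by its share of the census population."""
    if set(segment_incidence) != set(census_share):
        raise ValueError("segments must match between sample and census")
    total_share = sum(census_share.values())
    weighted = sum(segment_incidence[s] * census_share[s] for s in census_share)
    return weighted / total_share


if __name__ == "__main__":
    # Hypothetical incidences observed in three educational segments of a sample.
    sample = {"grade school": 0.30, "high school": 0.50, "college": 0.70}
    # Hypothetical census shares of the same segments.
    census = {"grade school": 0.50, "high school": 0.35, "college": 0.15}
    print(round(corrected_incidence(sample, census), 3))
```

Weighting of this kind brings the segment proportions into line with the census, but it leaves untouched whatever selection operates inside each segment, such as the concentration in Indiana or the willingness to volunteer a history noted above.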

Further, KPM chose to study the behavior of many (at least 163
in tabular form) segments of this large population, feeling, apparently, 
both that comparisons among segments would be illuminating and 
that data for (clinical) application to individuals should come from a 
reasonably homogeneous segment. KPM's choice of a broad population 
created many problems, particularly in sampling. Whether they would
have been well advised to confine themselves to a more restricted
population, e.g., the state of Indiana, is debatable. For our part, we are willing
to take their choice as given, and to discuss briefly elsewhere some
alternatives for further work (Chapter IX-D).


7. Observed, recorded, or reported behavior 


KPM, interested in actual behavior, had, in principle, the choice 
of studying observed, recorded, or reported behavior. But since they 
selected a broad population and orgasm as the type of behavior, their 
only feasible choice seems to have been reported behavior. This situa- 
tion does not seem likely to change in the foreseeable future. 

The choice of reported behavior implies that the question: “On the
average, how much difference is there between present reported and
past actual behavior?” is seriously involved in any inferences about
actual behavior which are attempted from KPM’s results. The
difference might well be large, leading to a large systematic error in
measurement. However, use of observed or recorded behavior in order to avoid
this difference does not seem to us a feasible way to measure
nationwide incidences and frequencies for KPM’s broad population, because
it would have produced systematic errors in sampling possibly larger 
than the error in measurement. 


8. Interview or questionnaire, and types thereof 


Having settled on reported behavior, KPM had to decide whether 
this report should be oral or written, and what methods should be 
used to elicit it. Their choice was oral, in a face-to-face interview
whose flavor was designed to be that of a doctor or family friend. 
The choice of oral rather than written report: 


(1) made it possible to obtain apparently satisfactory answers from 
many more subjects (the percentage of complete illiteracy in the 
U. S. is small, but the percentage of illiteracy on complex sub- 
jects not usually written about is undoubtedly substantial). 

(2) permitted and encouraged variation of the form of the questions 
to suit the subject and the situation.


Those, like some critics, who believe in a repeatable measurement 
process, regardless of whether or not it measures something that is 
always relevant, find (2) bad. Those who, like KPM, feel that appropriately
flexible wording improves communication and thus improves
the quality of report despite the variability resulting from changes 
in the form of questions, find (2) good. 

Given an interview rather than a questionnaire, the remaining 
choices of KPM follow a consistent pattern. In nearly every case 
their approach resembled the clinical interview more closely than the 
psychometric test. 


9. Which subjects? 
Here there are various choices, pertaining to: 


(1) selection of individuals one at a time or in clusters. 

(2) keeping age, education, marital status, etc., segments in the sam- 
ple proportionate to those in the population or making them of 
more nearly equal size. 

(3) selecting individuals on a catch-as-catch-can basis, a partly randomized
basis, or according to a probability sampling plan.


They chose: 


(1) to select individuals in clusters. 

(2) to keep age, education, marital status, etc., segments more nearly
equal in the sample than in the population. 

(3) to use no detectable semblance of probability sampling ideas. 


The pros and cons will be discussed later. 


10. What methodological checks? 


There are choices as to the types of checks and the number of each 
to be made. The types of checks made by KPM, including 

(1) take-retake, 

(2) husband-wife, 

(3) duplicate recording of interview,

(4) overall comparison of interviews, 

(5) others (see Chapter V-A) 


seem to cover all those easily thought of. The numbers of checks made
are discussed later. Duplicate recording of interviews occurred in an
unknown, but presumably small, number of cases. No comparisons
from duplicate recordings were reported, perhaps because most occurred
in connection with the training of interviewers.


11. How analyzed and presented?


In analyzing frequency and incidence of activity, KPM chose to 
report both raw and “U. S. Corrected" data and to make simple com- 
parisons. Just what was done in general was clearly stated, but the 
steps involved in detailed computations were not explained. No at- 
tempt was made to find helpful scales or composite variables (see 
Chapters IV-C and V-C). 

With the exception of “U. S. Corrections,” most of the analysis of
the tabular data is confined to straightforward description. Some at- 
tention is paid to the problem of sample-population relation in the form 
of standard errors (presumably underestimated because they were 
based on the assumption of random sampling). However, this ap- 
proaches lip service, since many apparent differences are discussed 
with no attention to significance or nonsignificance. (Again we do not 
regard this as an automatic criticism, particularly since accurate indi- 
cation of significance would have been difficult—see Section A-18.) 

In analyzing cumulative activity, KPM's main tool was the accumu- 
lative incidence curve, a technique which they developed independ- 
ently. 


12. How interpreted? 
The main choices concerned


(1) extent of warning about possible differences between reported 
behavior and actual behavior,

(2) extent of warning about possible differences between the sam- 
pled population (see Section 18) and the entire U.S. white male 
population, 

(3) extent of warning about sampling fluctuations, 

(4) extent of verbal discussion not based on evidence presented, 

(5) certainty with which conclusions were presented. 


Under (1) the emphasis was on methodological checks in order to indicate,
as far as they could, how small this difference seemed to KPM
to be. Under (2) there was little discussion. Under (3) the warnings
were made early, incompletely, but not often. Under (4) the extent
of discussion was substantial, most of it aimed at social and legal attitudes
about sexual behavior, and descriptions of practices not covered
by the tables. Under (5) the conclusions were usually presented with an
air of solid certainty.

In general the observations seem to have been interpreted with more 
fervor than caution, although occasional qualifications may be found. 


CHAPTER III. PRINCIPLES OF SAMPLING
13. Introduction

It is difficult, if not impossible, to assess the quality of any sample 
and its analysis without comparing it with a set of principles. This is 
particularly true of KPM's works. The present chapter endeavors to 
set down, in compact form, a few of the principles of sampling which
are especially relevant to a consideration of KPM's sampling. As we
have noted (Section 6), KPM chose to select individuals in groups or
clusters, to divide the population into segments and keep segment 
sizes more nearly equal in the sample than in the population, and to 
use no semblance of probability sampling ideas. The discussion in this 
chapter concentrates on these aspects of sampling. 

Many readers will, we believe, desire a more connected account of 
the principles of sampling, with examples and fuller discussion. These 
are provided in Appendix G. Any reader who finds the statements 
used in this chapter unclear, or not intuitively acceptable, is urged to 
turn to Appendix G before proceeding further. Once there, he should
read through from the beginning, since argument and exposition there
are closely knit and unsuited to piecemeal references.

Whether by biologists, sociologists, engineers, or chemists, sampling 
is often taken too lightly. In the early years of the present century, it 
was not uncommon to measure the claws and carapaces of 1000 crabs,
or to count the number of veins in each of 1000 leaves, and to attach 
to the results the “probable error” which would have been appropriate 
had the 1000 crabs or the 1000 leaves been drawn at random from the 
population of interest. If the population of interest were all crabs in a 
widespread species, it would be obviously almost impossible to take
a simple random sample. But this does not bar us from honestly assess- 
ing the likely range of fluctuation of the result. Much effort has been 
applied in recent years, particularly in sampling human populations, 
to the development of sampling plans which, simultaneously,


(i) are economically feasible,
(ii) give reasonably precise results, and
(iii) show within themselves an honest measure of fluctuation of
their results.


Any excuse for the practice of treating non-random samples as random 
ones is now entirely tenuous. Wider knowledge of the principles involved
is needed if scientific investigations involving samples (and what such 
investigation does not involve samples?) are to be solidly based. 
Additional knowledge of techniques is not so vitally important, though 
it can lead to substantial economic gains. 


14. Cluster sampling


A botanist who gathered 10 oak leaves from each of 100 oak trees
might feel that he had a fine sample of 1000, and that, if 500 were
infected with a certain species of parasites, he had shown that the
percentage infection was close to 50%. If he had studied the binomial
distribution, he might calculate a standard error according to the usual
formula for random samples, $p \pm \sqrt{pq/n}$, which in this case yields
50 ± 1.6% (since p = q = .5 and n = 1000). In doing this he would neglect
three things:


(i) probable selectivity in selecting trees (favoring large trees, perhaps),

(ii) probable selectivity in choosing leaves from a selected tree (favoring
well-colored or, alternatively, visibly infected leaves, perhaps), and

(iii) the necessary allowance, in the formula used to compute the 
standard error, for the fact that he had not selected his leaves 
individually. 


Most scientists are keenly aware of the analogs of (i) and (ii) in their 
own fields of work, at least as soon as they are pointed out to them. 
Far fewer seem to realize that, even if the trees were selected at random
from the forest, and 10 leaves were chosen at random from each
selected tree, (iii) must still be considered. But if, as might indeed be
the case, each tree were either wholly infected or wholly free of infec- 
tion, then the 1000 leaves tell us no more than 100 leaves, one from 
each tree, since each group of 10 leaves will be all infected or all free 
of infection. In this event, we should take n=100 in calculating the 
standard error and find an infection rate of 50 ± 5%. Such an extreme
case of increased fluctuation due to sampling in groups or clusters
would be detected by almost all scientists, and is not a serious danger. 
But less extreme cases easily escape detection. 
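
The arithmetic of the botanist example can be checked with a short sketch. The first function is the usual binomial formula quoted above. The second inflates it by the factor 1 + (m - 1)ρ for clusters of equal size m, where ρ is an intraclass correlation. That factor, the function names, and the intermediate value ρ = 0.2 are assumptions introduced here for illustration; they do not appear in the report.

```python
import math


def binomial_se(p, n):
    """Standard error of a proportion when n units are drawn individually at random."""
    return math.sqrt(p * (1.0 - p) / n)


def cluster_se(p, n_clusters, cluster_size, rho):
    """Standard error for equal-sized clusters (assumed design effect 1 + (m - 1) * rho)."""
    n = n_clusters * cluster_size
    design_effect = 1.0 + (cluster_size - 1) * rho
    return binomial_se(p, n) * math.sqrt(design_effect)


p = 0.5
se_naive = binomial_se(p, 1000)            # 1000 leaves treated as independent draws
se_extreme = cluster_se(p, 100, 10, 1.0)   # each tree wholly infected or wholly free
se_mild = cluster_se(p, 100, 10, 0.2)      # hypothetical, less extreme clustering

print(f"naive:   {100 * se_naive:.1f}%")
print(f"extreme: {100 * se_extreme:.1f}%")
print(f"mild:    {100 * se_mild:.1f}%")
```

With ρ = 1 the result coincides with taking n = 100, as in the text; intermediate values of ρ give the less extreme cases that "easily escape detection."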

We have just described, as one example of the reasons why the 
principles of sampling need wider understanding, an example of 
cluster sampling, where the individuals or sampling units are not 
drawn separately and independently into the sample, but are drawn 
in clusters, and have tried to make it clear that “individually at ran- 
dom” formulas do not apply. Cluster sampling is often desirable, but 
must be analyzed appropriately. KPM's sample was, in the main, 
a cluster sample, since they built up their sample from groups of people 
rather than from individuals. 


15. Possibilities of adjustment


Often the population is divided into segments of known relative size, 
perhaps from a census. It is sometimes thought that the best method 
of sampling is to take the same proportion from every segment, so 
that the sample sizes in the segments match the corresponding popula- 
tion sizes. Such samples do have the advantage of simplifying computa- 
tions by equalizing weights, and they sometimes lead to a reduction 
of sampling error. But modern sampling theory shows that optimum 
allocation of resources usually requires different proportions to be 
sampled from different segments, whether the purpose is to estimate 
average values over the population or to make analytical comparisons 
between results in one group of segments and those in another. 
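
One standard form of this result, often called Neyman allocation, takes the sample size in each segment proportional to the segment's population size times its standard deviation. The text does not name or state that rule, so the sketch below is only an illustration of how an optimum allocation can depart from a proportionate one. The segment sizes, standard deviations, and function names are hypothetical.

```python
def proportional_allocation(sizes, n):
    """Split a total sample of n in proportion to segment population sizes."""
    total = sum(sizes)
    return [n * size / total for size in sizes]


def neyman_allocation(sizes, sds, n):
    """Split a total sample of n in proportion to (segment size) x (segment SD)."""
    products = [size * sd for size, sd in zip(sizes, sds)]
    total = sum(products)
    return [n * product / total for product in products]


# Hypothetical population: a large homogeneous segment and a small variable one.
sizes = [8000, 2000]
sds = [1.0, 4.0]

print(proportional_allocation(sizes, 100))
print(neyman_allocation(sizes, sds, 100))
```

In this made-up case the small but variable segment receives half the sample rather than a fifth of it.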

When there are disparities in the relative sizes of segments in the 
sample as compared with the population, whether accidental or 
planned, these disparities must be taken into account when we attempt
to estimate averages over the whole population. One way in
which this can be done is by adjustments applied to the segments. Such 
adjustments proceed as follows. Suppose that we know 


(i) the true fraction of the population in each segment, and 
(ii) the segment into which each individual in the sample falls. 


Then we can weight each individual in the sample by the ratio 


$$\frac{\text{fraction of population in that segment}}{\text{fraction of sample in that segment}}$$


(It is computationally convenient to weight each segment mean with 
the numerator of this ratio; the result is algebraically identical to that 
described above.) 
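
The two equivalent computations just described can be sketched as follows. The segment labels, data values, and function names are hypothetical, and the sketch assumes that every segment of the population is represented in the sample.

```python
from collections import Counter, defaultdict


def adjusted_mean_by_individual(values, segments, population_fractions):
    """Weight each individual by (population fraction) / (sample fraction) of his segment."""
    n = len(segments)
    counts = Counter(segments)
    weights = {seg: population_fractions[seg] / (counts[seg] / n) for seg in counts}
    numerator = sum(weights[seg] * value for value, seg in zip(values, segments))
    denominator = sum(weights[seg] for seg in segments)
    return numerator / denominator


def adjusted_mean_by_segment(values, segments, population_fractions):
    """Weight each segment mean by the population fraction of that segment."""
    totals = defaultdict(float)
    counts = Counter(segments)
    for value, seg in zip(values, segments):
        totals[seg] += value
    return sum(population_fractions[seg] * totals[seg] / counts[seg] for seg in counts)


# Hypothetical data: segment B is 20% of the population but 50% of the sample.
values = [1.0, 3.0, 10.0, 20.0]
segments = ["A", "A", "B", "B"]
population_fractions = {"A": 0.8, "B": 0.2}

print(sum(values) / len(values))                                              # raw mean
print(adjusted_mean_by_individual(values, segments, population_fractions))   # adjusted
print(adjusted_mean_by_segment(values, segments, population_fractions))      # same result
```

The adjustment changes only the relative sizes of the segments; the values inside each segment are taken as they came.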

The result of adjustment is a new “sampled population”—one such 
that the relative sizes of its various segments are very nearly correct 
(according to (i) above). Since the weight is the same for all the sample 
individuals in a given segment, adjustment does nothing to redress 
any selectivity which may be present within segments. If we adjust 
in this way, we remove one source of systematic error without affecting 
other sources at all. The philosophy of such adjustments is discussed 
further in Section G-12, and it is concluded that they may generally 
be appropriately made (within the limits discussed in Sections C-16
to C-18). Their chief danger is the possible neglect of the possibilities
that they may be


(i) entirely too small,
(ii) too large,
(iii) in the wrong direction,
because of unredressed selectivity within the segments. When this pos- 
sibility exists, extreme caution in presenting the results of adjustment 
is indicated. 


16. Probability samples 
When probability samples are used, inferences to the population 
can be based entirely on statistical principles rather than subject- 
matter judgment. Moreover, the reliability of the inferences can be 
judged quantitatively. A probability sample is one in which 
(i) each individual (or primary unit) in the sampled population has 
a known probability of entering the sample, 
(ii) the sample is chosen by a process involving one or more steps of 
automatic randomization consistent with these probabilities, 
and
(iii) in the analysis of the sample, weights appropriate to the proba- 
bilities (i) are used. 


Contrary to some opinions, it is not necessary, and in fact usually not 
advisable in a pure probability sample for
(i) all samples to be equally probable, or
(ii) the appearance of one individual in the sample to be unrelated 
to the appearance of another. 


In practice, because some respondents cannot be found or are unco- 
operative, we usually obtain, at best, approximate probability samples 
(see Sections A-2 and D-13) and have approximate confidence in our 
inference. 
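
A minimal sketch of conditions (i) to (iii) follows, assuming the simplest possible mechanism: each unit enters the sample independently with its own known probability, and is weighted in the analysis by the reciprocal of that probability. Independence is used only to keep the sketch short; as noted above, a probability sample does not require it. The unit labels, probabilities, and function names are hypothetical.

```python
import random


def draw_sample(inclusion_probs, rng):
    """Steps (i) and (ii): include each unit by an automatic random draw
    consistent with its known probability of entering the sample."""
    return [unit for unit, p in inclusion_probs.items() if rng.random() < p]


def weighted_mean(values, inclusion_probs, sample):
    """Step (iii): weight each sampled unit by the reciprocal of its probability."""
    weights = {unit: 1.0 / inclusion_probs[unit] for unit in sample}
    total_weight = sum(weights.values())
    return sum(weights[unit] * values[unit] for unit in sample) / total_weight


# Hypothetical units with unequal, but known, chances of selection.
values = {"a": 2.0, "b": 4.0, "c": 10.0, "d": 6.0}
inclusion_probs = {"a": 0.5, "b": 0.5, "c": 0.1, "d": 1.0}

sample = draw_sample(inclusion_probs, random.Random(1))
if sample:
    print(sample, weighted_mean(values, inclusion_probs, sample))
```

Unit "c" is rarely drawn, but when it is drawn it stands for ten units like itself, which is why unequal chances of selection need not distort the estimate.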


17. Nonprobability samples

Samples which are not even approximately probability samples 
vary widely in both actual and apparent trustworthiness. Their trustworthiness
usually increases as they are insulated more and more
thoroughly from selective factors which might be related to the quanti- 
ties being studied. Insulation may be obtained by: 

(i) adjustments applied to the segment means in the sample, 
(ii) examination of the sample as drawn for signs of selection on a
particular factor, 
(iii) partial randomization. 

Adjustment for segments, as explained in Section 15 above, corrects
for any selective factor operating between segments, but corrects not
at all for selective factors operating within segments. If adjustment is 
to be used, deliberate selectivity between segments may be exercised 
without danger, so long as it does not imply selectivity within segments. 

Negative results when the sample is examined for signs of selection
on a particular variable are comforting, and strengthen the reliability
of the sample. The amount of this strengthening depends very much 
on the a priori importance of the variables checked to what is being 
studied. 

Deliberate (partial) randomization is a step toward a probability
sample, and may be very helpful on occasion. 


18. Sampled population and target population. 


We have found it helpful in our thinking to make a clear distinction
between two population concepts. The target population is the population
of interest, about which we wish to make inferences or draw
conclusions. It is the population which we are trying to study. The
sampled population requires a more careful definition but, speaking
popularly, it is the population which we actually succeed in sampling. 

The notion of a sampled population can be more clearly described 
for probability sampling. In order to have probability sampling, we 
must know the chance that every sampling unit has of entering the 
sample, and the weight to be attached to the unit in the analysis. 
The sampled population may be defined as the population generated 
by repeated application of these chances and these weights. The fre- 
quency of occurrence of any particular sampling unit in the sampled 
population is proportional to the product 


(chance of entering the sample) x (weight used in analysis). 


This product is made constant for a probability sample. Thus, with 
probability sampling, the sampled population consists of all sampling 
units which have a non-zero chance of selection. 
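
The product rule can be illustrated with a small sketch that computes how often each unit appears in the sampled population, relative to the average unit. The unit labels, chances, weights, and function name are hypothetical. In the second case the chances are of course not known in practice; they are assumed here only to show what happens when weights fail to offset them.

```python
def relative_frequency(chances, weights):
    """Frequency of each unit in the sampled population, scaled so the average is 1.

    Each frequency is proportional to
    (chance of entering the sample) x (weight used in analysis).
    """
    products = {unit: chances[unit] * weights[unit] for unit in chances}
    mean_product = sum(products.values()) / len(products)
    return {unit: product / mean_product for unit, product in products.items()}


# Probability sample: weights are reciprocals of the known chances,
# so the product is constant and every unit appears equally often.
known_chances = {"a": 0.1, "b": 0.5, "c": 0.25}
reciprocal_weights = {unit: 1.0 / p for unit, p in known_chances.items()}
print(relative_frequency(known_chances, reciprocal_weights))

# Non-probability sample analyzed with equal weights: units with larger
# (assumed) chances are over-represented in the sampled population.
assumed_chances = {"a": 0.6, "b": 0.3, "c": 0.3}
equal_weights = {unit: 1.0 for unit in assumed_chances}
print(relative_frequency(assumed_chances, equal_weights))
```

A unit with zero chance of selection has frequency zero under any weight, which is why it lies outside the sampled population.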

The sampled population is an important concept because by statisti- 
cal theory we can make quantitative inferential statements, with known 
chances of error, from sample to sampled population. It must be 
carefully distinguished from the target population, the population of 
interest about which we are tempted to make similar inferential statements.

Even with probability sampling, the sampled and the target population
usually differ because of the presence of “refusals,” “not-at-homes,”
“unable to classify,” and so on. The consequence of these
disturbances is that certain sampling units, although assigned a known 
chance of selection by the sampling plan, did not in fact have this 
chance in practice. 

With non-probability sampling, the situation is much more obscure. 
By its definition as given above, the sampled population depends on 
the existence of a sampling plan (which may be only a vague set of
principles in the investigator's head) and on the "chances" that any 
sampling unit had of being drawn. These chances are not well known— 
if they were, we should have a probability sample. But in many cases, 
it is reasonable to behave as if these chances exist and to attempt to 
estimate them, because they provide the only means of making statis- 
tical inferences beyond the non-probability sample to a corresponding 
“sampled population.” The difficulty comes in specifying, or some- 
times even thinking about, the nature of the sampled population. It is 
certain to be a weighted population where, for example, Theodosius 
Linklater may appear 1.37 times, while Basil Svensson appears only 
0.17 times. 

Insofar as we make statistical inferences beyond the sample to a 
larger body of individuals, we make them to the sampled population. 
The step from sampled population to target population is based on 
subject-matter knowledge and skill, general information, and intuition 
—but not on statistical methodology. 


CHAPTER IV. THE INTERVIEW AREA


19. Interview vs. questionnaire 

The committee members do not profess authoritative knowledge 
of interviewing techniques. Nevertheless, the method by which the 
data were obtained cannot be regarded as outside the scope of the 
statistical aspects of the research.

For what our opinion is worth, we agree with KPM that a written 
questionnaire could not have replaced the interview for the broad 
population contemplated in this study. The questionnaire would not
allow flexibility which seems to us necessary in the use of language, in 
varying the order of questions, in assisting the respondent, in following 
up particular topics and in dealing with persons of varying degrees of 
literacy. This is not to imply that the anonymous questionnaire is 
inherently less accurate than the interview, or that it could not be
used fruitfully with certain groups of respondents and certain topics.
So far as we are aware, not enough information is available to reach a
verdict on these points. 


20. Interviewing technique


Many investigators have faced the problem of attempting to obtain 
accurate information about facts which the respondent is thought to 
be unwilling to report. It is natural to inquire whether KPM, in their 
interviewing technique, took advantage of accumulated experience 
as to the best methods for extracting the facts. But it is also well to
inquire how much definite experience has been accumulated. 

The KPM interview impressed us as an extraordinarily skillful performance.
Direct questions are put rapidly in an order which seems to
these respondents hard to predict, so that it is difficult to tell what is 
coming next. Despite the air of briskness, we did not receive the im- 
pression that we were being hurried if we wished to reflect before re- 
plying, and supplementary questions or information were given if this 
seemed helpful to the memory. The coded recording of the data was 
done unobtrusively by the interviewer, so that the interview appeared 
to be a friendly conversation rather than any kind of an inquisition.
These, of course, are personal impressions. 

KPM evidently think highly of the virtues of this technique, because 
it was adopted despite limitations which it imposes on the scope and 
rate of progress of the study. The technique makes great demands on 
the interviewer. The long period of training and the personal qualities 
required have restricted and will continue to restrict the interviewers 
to a very small number. This limits the speed with which data can be 
accumulated and also puts restrictions on the type of sampling that 
can be employed. 

The type of interview used by KPM differs markedly from the 
less directive methods which are sometimes recommended for dealing 
with taboo subjects. If the subject is likely to feel that his answer to a 
certain question will affect his prestige in the eyes of the interviewer, 
a less directive approach would be to conduct the interview in such a 
way that he gives the desired information without realizing that he is 
answering the awkward question. The KPM method is the antithesis 
of this. Research on interviewing techniques has not yet produced any 
substantial body of evidence as to the superiority of either the less 
directive methods or the KPM technique. 

With regard to specific inaccuracies in the KPM data, we believe that
the interview gives an opportunity both for positive and negative bias.
The KPM assumption that everyone has engaged in all types of activity
seems to some likely to encourage exaggeration by the respondents.
(KPM feel (personal communication) that their cross-checks are
highly effective in detecting such exaggeration.) On the other hand,
our impression from the interview was that a successful denial of certain
types of activity would be possible if the subject was prepared to
do so, although we do not know the full extent of the KPM cross-checks
which would lead them to be suspicious of such a denial. KPM
assert (personal communication) that they regard cover-up as a more
likely source of bias than exaggeration. Our opinions on this statement
are divided.

As KPM point out (p. 48), the subject’s willingness to talk about
certain types of activity is influenced by the attitudes of the social 
group to which he belongs. Until evidence to the contrary is presented, 
the presumption (made by some of the critics) that his final responses 
will also be influenced is one that cannot be cast aside. The size of these
influences is still a matter of opinion. A corresponding element of doubt
is present in almost all comparisons between different social levels, 
both those which provide some of the most interesting comparisons in 
the book, and those in many other studies. 


CHAPTER V. THE SAMPLING AREA 
21. KPM's sampled population 


As noted above, KPM's sample was deliberately disproportionate, 
partly in order to cover individual segments defined by age, education, 
religion, etc., in an adequate manner, partly because of geographical 
convenience. If the results for individual segments were to be based 
on samples of at least moderate size, such disproportion was necessary 
and wise. Its effects on overall results are less clear. It seems impossible
to be sure what effect it had on the variability of the final result, and 
its use is certainly not a demonstrable error as far as variability is
concerned. 

In their U. S. corrections, KPM provided adjustments for dispro- 
portion between segments defined by age, education, and marital status. 
As noted above (Section 17) we feel that such adjustments are usually
appropriate. Due to absence of population data, they did not adjust
for religion. The geographical imbalance of their sample was so great
that an overall geographic adjustment was not feasible. Thus they compensated
for some disproportions, and left others to produce what
effects they would. 

Their only examination of the sample for signs of selection within 
segments is their comparison of 100% groups (groups where all mem- 
bers were interviewed) with partial groups (groups where only part of 
the members were sampled). This gives some insight into the effect of
volunteering as a selective factor. Beyond this, KPM report no serious
effort to measure the actual effect of volunteering, or to discover what
percentage of the population they would be able to persuade to be interviewed.

They made no use of randomization. They might have attempted to
sample, say, college seniors from two colleges drawn at random from a
large list of colleges, but they are of the opinion (personal communication)
that this would have slowed up the work to an unmanageable
extent.

All in all, the absence of any orderly sampling plan contrasts strikingly
with their usual methodical mode of attack on other problems.


As stated briefly above (Section 6), the “sampled populations”
corresponding to


(1) KPM's raw means, and to
(2) KPM's “U. S. corrected” means,


respectively, are startlingly different from the composition of the U. S.
white male population. (For example, although these sampled populations
have the U. S. average combination of education and rural-urban
background, they have half of their members living in Indiana.)

Since a complete probability sample seems to have been out of the
question at the beginning of the KPM investigation, some such “sampled
population” was to be expected, although it might have been somewhat
less distorted. Provided that further statistical analyses of the
sort indicated in Appendix C, Chapter II-C were made, it would be
possible to make adequate rigorous inferences from the sample to
this ill-defined “sampled population.”

The inference from these vague entities to the U. S. white male
population depends on:


(a) the inferrer’s view as to what these “sampled populations” are
really like, and

(b) the inferrer's judgment as to how (reported) sexual behavior
varies within segments.


It is not surprising that experts disagree.

The inference from KPM’s sample to the (reported) behavior of all
U. S. white males contains a large gap which can be spanned only by


22. Could KPM have used probability sampling?


If probability sampling could have been used, its use would have 
avoided one of the main gaps in KPM's present chain of inference. 
We have, therefore, considered this possibility carefully. 

The difficulties in applying probability sampling to KPM's study
lie in the expenditure of time required to make the contacts necessary
to persuade a predesignated man to give a history. By adapting the
mechanism of the probability sample to KPM's situation, these difficulties
may perhaps be reduced (see Appendix D, Chapter V-D).
It would almost certainly have been impractical for KPM to have used
a probability sample in the early years of their study. If KPM's apparent
“opinions” (p. 39 of KPM) as to the effectiveness of their present
techniques of contact are correct, starting a probability sample
would have been practical at any time since the appearance of the
male volume in 1948.¹ However, KPM (personal communication, 195?)
feel that such an interpretation of their written statement is unwarranted.

Since it would not have been feasible for KPM to take a large sample
on a probability basis, a reasonable probability sample would be,
and would have been, a small one, and its purpose would be: 


(1) to act as a check on the large sample, and
(2) possibly, to serve as a basis for adjusting the results of the large
sample.

A probability sampling program planned to serve these purposes is 
discussed in Appendix D, Chapter VII-D. Such a program should
proceed by stages because of the absence of information on costs and
refusal rates.

This conclusion about probability sampling does not excuse KPM 
from the responsibility for choosing geographical disproportion in 
order to save travel time and expense. The wisdom or unwisdom of this 
choice seems to depend on one’s view as to the magnitude of geographi- 
cal differences. Again, it is not surprising that experts disagree. 


CHAPTER VI. METHODOLOGICAL CHECKS 


23. Possible checks 


The primary check, if it could be made, is the comparison of average
actual behavior with average reported behavior. Variability in the difference
between actual and reported behavior is secondary in interest,
because high variability merely implies the necessity of larger numbers
of cases, while large average differences between actual and reported
behavior represent a systematic error that cannot be adjusted without
rather complete knowledge. Unfortunately this primary check does not
at present seem feasible in studying human sexual behavior as it occurs
in our culture.

¹ “The number of persons who can provide introductions has continually spread until now, in the
present study, we have a network of connections that could put us into almost any group with which
we wished to work, anywhere in the country.” (P. 39 of KPM.)

Of secondary importance are checks of the single actual report with
the average actual report, where averages may be taken over fluctuations,
time, spouses, and/or interviewers. (See Appendix A, Chapter
V-A.) In this second category, the following possible comparisons suggest
themselves:


1. Reinterviews of the same respondent 

2. Comparison of spouses 

3. Comparison of interviewers on the same population segment 
4. Duplicate interviews by the same interviewer at various times. 


24. KPM's checks 


The only comparison of observed and reported behavior which KPM 
found feasible was the date of appearance of pubic hair, which agreed
quite successfully. This is a physical characteristic, different in character
and emotional loading from the behavior of main interest. Some
subjects may have had to rely upon general information, plus some
assistance from the interviewer, in naming a date for themselves. 
Thus this check furnishes rather weak support. 

At the level of rechecks on respondents, some information is avail- 
able but more is needed. Similarly, comparisons of spouses have been 
made for a relatively selected group. The checks themselves are en- 
couraging, but more cases are needed. 

Some attempts have been made to compare the staff interviewers 
but since there is some selection in the assignment of cases, these 
comparisons do not meet the problem as squarely as interviews of the 
same respondent by different interviewers, or the recorded interview 
technique. 

A comparison of early versus late interviews by Kinsey is given in 
KPM, but it is hard to tell, for example, whether the 12.4% drop 
(from 44.9% to 32.5%) in the accumulative incidence for total pre- 
marital intercourse at age 19 (single males, education level 13+) from 
early to late interviews is due to differing groups sampled, instability
in the interviewing process, or reasonable sampling variation for cluster
sampling (KPM p. 146).

KPM have made serious efforts to check their work in the aspects
where checking seems feasible. However, improved and more extensive 
checking is needed. Although duplicate recording of interviews is men- 
tioned, no data have been published. Even if they must be based on 
very few cases, such comparisons should be made available. 
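
The difficulty with the early-versus-late drop noted above can be made concrete with a small calculation. The sketch below is illustrative only: the group sizes and the design effects are hypothetical values chosen for the example (neither is given in this section), and only the two percentages come from KPM. It computes the ratio of the observed drop to its standard error under simple random sampling, then repeats the calculation with the variance inflated by an assumed factor for cluster sampling.

```python
import math

# Figures quoted above from KPM: accumulative incidence at age 19.
p_early = 0.449
p_late = 0.325

# Hypothetical group sizes; the true counts are not given in this section.
n_early = 200
n_late = 200


def se_difference(p1, n1, p2, n2):
    """Standard error of p1 - p2 under simple random sampling."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def ratio_to_se(design_effect):
    """Observed drop divided by its standard error, after inflating the
    simple-random-sampling variance by an assumed design effect."""
    se = se_difference(p_early, n_early, p_late, n_late)
    return (p_early - p_late) / (se * math.sqrt(design_effect))


# Assumed design effects: 1.0 means simple random sampling.
for deff in (1.0, 2.0, 4.0):
    print(f"design effect {deff}: drop / SE = {ratio_to_se(deff):.2f}")
```

With these assumed sizes, the drop exceeds twice its standard error under simple random sampling but falls below twice its standard error once the variance is inflated fourfold. The verdict therefore turns on quantities that cannot be read from the published table.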


CHAPTER VII. ANALYTICAL TECHNIQUES 
25. Variables affecting sexual behavior 


After introductory chapters (5 and 6) on early sexual growth and 
activity, KPM proceed to examine the effects of the following vari- 
ables: 

Age 

Marital status 

Age of adolescence 

Social level 

Comparison of two generations 

Vertical mobility in the occupational scale 
Rural-urban background 

Religious background 


In this chapter we attempt to appraise, in general terms, the analytical 
techniques used by KPM in their study of these variables. 


26. Definition of the variables 


Some of the variables: age of adolescence, social level, occupational 
level, rural-urban background and religious background, involve prob- 
lems of definition. These seem to have been in the main thoughtfully 
handled and presented by KPM. For instance, KPM discuss the relative
merits of educational level attained by the subject and of the
occupational class of the subject and of his parents as a measure of social
level (pp. 330-32). In their opinion, educational level is the most satis- 
factory criterion and this was adopted for the analysis. In the case of 
religious affiliation, KPM distinguish between active and inactive pro- 
fession of religious faith, though the definition of the two terms is not 
made entirely clear. 

The definition which looks least satisfactory is that of age of
adolescence (p. 299), where the problem is formidable. The criteria employed
by KPM appear difficult for the reader to interpret.


27. Assessing effects of variables 


With a multiplicity of variables which may interact on each other, 
the task of assessing the importance of each variable individually is 
not easy. Examination of the variables one by one, ignoring all other
variables except the one under scrutiny, may give wrong conclusions, 
because what appears on the surface to be the effect of one variable 
may be merely a reflection of the effects of other variables. 

A thorough attack on this problem calls for a multiple-variable
approach in which all effects are investigated simultaneously. This
requires a high degree of statistical maturity and of skill in presentation.
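
The danger of examining one variable at a time can be shown with a small numerical sketch. All counts below are invented for illustration and are not KPM data; the group labels "A" and "B" and the two strata stand for any pair of variables that are unevenly mixed in the sample. Within each stratum group A shows the higher rate, yet the one-variable comparison, which ignores the strata, shows the opposite.

```python
# Hypothetical counts, invented purely for illustration (not KPM data).
# Key: (stratum of a second variable, group); value: (number active, subjects).
counts = {
    ("first", "A"): (120, 400),
    ("first", "B"): (20, 100),
    ("second", "A"): (80, 100),
    ("second", "B"): (280, 400),
}


def marginal_rate(group):
    """Rate for a group when the second variable is ignored."""
    active = sum(a for (_, g), (a, _n) in counts.items() if g == group)
    total = sum(n for (_, g), (_a, n) in counts.items() if g == group)
    return active / total


# One-variable comparison, ignoring the strata.
marginal = {g: marginal_rate(g) for g in ("A", "B")}

# Comparison with the second variable held fixed.
within = {
    stratum: {g: counts[(stratum, g)][0] / counts[(stratum, g)][1] for g in ("A", "B")}
    for stratum in ("first", "second")
}

print("ignoring strata:", marginal)
print("within strata:  ", within)
```

Holding the second variable fixed, as KPM do for age, marital status and educational level, removes this particular distortion. It does nothing about variables that were not held fixed.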

The method utilized by KPM is a compromise. In general, with some
exceptions, they regard age, marital status and educational level as 
basic variables, which are held fixed or compensated for in the investi- 
gation of each of the remaining variables. The other variables are dis- 
regarded for the moment. Although we have not examined the matter 
exhaustively, this policy seems to have been justified by events, be- 
cause KPM claim from their analyses that the other variables, with 
the exception of age at adolescence, have had relatively minor effects. 


28. The measurement of activity 


In the KPM tables, activity is measured by "incidence" (per cent of
the population who engage in the activity) as well as by frequency per 
week. In some tables, both mean and median frequencies are given, 
and also frequencies for the total and for the active population. There 
are advantages in presenting various measures. On the other hand, 
inspection suggests that all these measures are correlated: that is, 
to some extent they tell the same story. A complex internal analysis 
would probably show about how many measures are really needed to 
extract the information in the data and what individual measurements, 
or combinations of them, are best for this purpose. Perhaps a single 
one, or at most two, would suffice. As it is, both KPM and the
industrious reader have to wade through tables and discussion of a number
of different measurements, without being clear whether anything new
is learned. Simplification would be pleasant, but is far from essential.


29. Tests of significance 


In the discussion of effects which they regard as real, KPM make
little appeal to tests of significance. They often present standard errors
attached to the mean frequencies for individual cells. Because sampling
was non-random and was by groups, these standard errors, calculated
on the assumption of randomness, are under-estimates, perhaps by a
substantial amount. The standard errors have a kind of negative
virtue, in the sense that if a difference is not significant when judged
against these errors, it would not be significant if a valid test could be
devised. The problem of devising a realistic estimate of the true
standard errors is one of considerable complexity (see Section II-C).
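
To indicate how large the under-estimate might be, the following sketch uses a common textbook approximation for sampling by groups of equal size: the variance under random sampling is multiplied by 1 + (m - 1) * rho, where m is the group size and rho the correlation between members of the same group. This approximation is not taken from KPM or from this report, and every number in the example is hypothetical.

```python
import math


def srs_standard_error(p, n):
    """Standard error of a proportion p from n independent observations."""
    return math.sqrt(p * (1 - p) / n)


def design_effect(group_size, within_group_corr):
    """Variance inflation for equal-sized groups (textbook approximation)."""
    return 1 + (group_size - 1) * within_group_corr


def grouped_standard_error(p, n, group_size, within_group_corr):
    """Standard error after allowing for sampling by groups."""
    deff = design_effect(group_size, within_group_corr)
    return srs_standard_error(p, n) * math.sqrt(deff)


# Hypothetical cell: 600 subjects collected in groups of 21,
# with an assumed within-group correlation of 0.1.
naive = srs_standard_error(0.4, 600)
adjusted = grouped_standard_error(0.4, 600, group_size=21, within_group_corr=0.1)
print(f"random-sampling SE: {naive:.4f}, allowing for groups: {adjusted:.4f}")
```

Under these assumed values the variance is inflated threefold. The calculation does not touch the separate problem of non-random selection, which no adjustment of this kind can repair.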

We have been unable to discover from the book the principles by 
which KPM decide when to regard an effect as real. The size of the
effect is one criterion. Size should certainly be taken into account, 
since an effect may be significant statistically but too small to be of 
biological or sociological interest. They evidently attach some impor- 
tance to the consistency with which an effect is exhibited in different 
parts of a table. As a criterion, consistency is of variable worth. Con- 
sistency over different age groups (where age denotes age at the time 
of the reported activity) is of little worth, since there is inevitably 
substantial correlation between sampling fluctuations of reported 
activities at neighboring ages because the same subject appears in 
neighboring age groups. More weight can be attached to consistency 
over different educational levels, because different groups of subjects 
are involved. 

To summarize, statements about the data in their tables lie at the 
level of shrewd descriptive comment, rather than at the level of an 
attempt to make inferential statements from a sample to a clearly de-
fined population (even though this could not be the U. S. white male
population). 

We do not propose to discuss the analysis for each variable sepa- 
rately. Two analyses which have attracted much attention will be con- 
sidered later (Sections 33 to 37). 


30. U. S. Corrections 


In most sampling plans it is necessary to provide a set of weights 
for the segments of the sampled population to recover accurate
estimates for the target population (i.e., the population about which
inferences are desired). That such adjustments are usually
appropriate, whether probability or nonprobability samples are employed,
has already been pointed out (Section 17, see Section II-G). 

Since KPM have as their target population U. S. white males, we can 
reasonably expect them to apply weights in an attempt to correct 
for disproportionate representation in the sampled population of some 
segments of the target population. 
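
A minimal sketch of such a weighted estimate follows. The three educational levels are the ones used in KPM's tables, but the population shares and sample counts are invented for illustration, and the function names are our own. Each segment's sample rate is multiplied by that segment's assumed share of the target population.

```python
# Hypothetical figures, invented for illustration (not KPM data).
# segment: (assumed share of target population, sample size, number active)
segments = {
    "0-8": (0.5, 100, 20),
    "9-12": (0.3, 200, 80),
    "13+": (0.2, 700, 420),
}


def unweighted_rate(data):
    """Rate obtained by pooling the sample without correction."""
    active = sum(a for (_share, _n, a) in data.values())
    total = sum(n for (_share, n, _a) in data.values())
    return active / total


def weighted_rate(data):
    """Rate with each segment weighted by its share of the target population."""
    return sum(share * a / n for (share, n, a) in data.values())


raw = unweighted_rate(segments)
corrected = weighted_rate(segments)
print(f"unweighted: {raw:.2f}, weighted: {corrected:.2f}")
```

In this invented sample the college segment is heavily over-represented, so the pooled rate overstates the weighted one. The weights correct only for the proportions of the segments, not for selection of individuals within a segment.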

KPM supply U. S. Corrections (pp. 106-9) and use them rather
consistently throughout the work. There are no examples given explaining
the application of the weights. The critics, and sometimes this
committee, have had difficulty in verifying computations where they have
been used. Of the 13 tables where corrections could be checked
completely, one checked, 10 checked except for one age group each, and two
were not checked by the correction mentioned in the text. Apparently
the exposition could be improved.

The U. S. Corrections should be used, but it might be possible to
make a more effective choice of segments (see A-43 and V-C and II-G). 

KPM did not sufficiently warn the reader that U. S. corrected figures 
are not corrected for selection within segments, and may be seriously
biased. 


31. The accumulative incidence curve 


KPM have a useful device for summarizing incidence data by age. 
This accumulative incidence curve gives the percentage of individuals 
in the sample (reporting for a given age) to whom a particular event 
has occurred before that age. Although the explanation of the concept 
of accumulative incidence is not as clear as most of KPM's writing, the 
computations made are satisfactory. When there are no generation- 
to-generation changes in the population and no differential recall 
depending on age at report, this method is particularly justified, be- 
cause it packs all the incidence data neatly into one grand summary. 
(For discussion of the critics’ comments see A-39.) No better method 
for overall comparisons seems to be available. 
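
Because the concept is easily misread, a short sketch of the computation may help. The records are invented, and the function name and the reading of "reporting for a given age" as "aged at least that many years at interview" are our assumptions rather than KPM's stated procedure.

```python
# Each hypothetical record: (age at interview, age at first occurrence or None).
records = [
    (30, 17),
    (25, None),
    (18, 16),
    (40, 22),
    (20, 19),
    (16, None),
]


def accumulative_incidence(data, age):
    """Per cent of subjects able to report for `age` to whom the event
    occurred before that age. Returns None if nobody can report."""
    # Only subjects who had reached the given age at interview can report for it.
    eligible = [first for (age_at_interview, first) in data if age_at_interview >= age]
    if not eligible:
        return None
    occurred = sum(1 for first in eligible if first is not None and first < age)
    return 100.0 * occurred / len(eligible)


curve = {age: accumulative_incidence(records, age) for age in (18, 20, 25)}
print(curve)
```

The base of the percentage shrinks as age increases, since younger subjects drop out of the later ages. This is why the curve depends on the absence of generation-to-generation changes and of differential recall.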


32. Other devices


1. KPM did some extensive sampling experiments on their data, 
with a view to discovering the sample size needed for the accuracy 
they desired. These experiments turned out to be almost valueless 
because KPM did not take account of the necessary statistical princi- 
ples (see A-19). 

2. The committee had an opportunity to inspect the KPM facilities
on a visit to Bloomington, Indiana. We observed that the data sheets
were neatly filled out, that the files were well kept, that requests for 
original data were usually met in a matter of moments, and that the 
office was well equipped for handling the extensive data with which 
KPM deal. 

3. The KPM volume was written while data were still being col- 
lected. Apparently KPM chose to use all the data on hand at the time 
a particular point was being analyzed (personal communication from
KPM). Thus different tables have different totals, a source of annoy- 
ance to critics and users of the book. The reasons for this should have 
been pointed out by KPM. The additional interviewing was deliber- 
ately selective with an aim to strengthen weak segments (personal 
communication from KPM). It seems to us that, if this strengthening
was necessary for later analyses, it would have been worthwhile to add 
the new material to the early tabulations. This would also have in- 
creased comparability and avoided the problems raised by the exist- 
ence of many different sampled populations. 


CHAPTER VIII. TWO COMPLEX ANALYSES 
33. Patterns in successive generations 


In this chapter we discuss briefly two analyses by KPM which have 
attracted much attention. Our object is to give two specific illustra- 
tions of the kind of analysis which they chose to undertake, with 
comments on their competence. 

The first analysis was made by dividing the sample into two groups: 
those over 33 years of age at the time of interview, with a median age 
of 43.1 years, and those under 33 years at the time of interview, with a 
median age of 21.2 years. 

Our comments deal with three topics: (i) the statistical methodology 
employed (ii) KPM’s summary of their tables (iii) the general problem 
of inference from data of this type. 


34. Statistical methods 

In the comparisons, educational level and age at the time of the 
activity are held constant and in nearly all comparisons marital status 
also. The method used to compare the group means seems satisfactory 
except for some minor points, discussed in A-25, A-33 and A-43. 

It would have been helpful to present classifications of the older and
younger groups according to other factors which might influence sexual 
activity, e.g., rural-urban background, religious affiliation, marital 
status at age 20 or 25. The two groups would not necessarily agree 
closely in these break-downs, for there has been a slow drift towards 
the towns, and perhaps a drift towards “inactive” rather than “active” 
religious affiliation. For interpretive purposes it is advisable, in any 
event, to learn as much as possible about the compositions of the older 
and younger groups. Some critics have claimed that the older genera- 
tion is “atypical.” 


35. KPM's summary of their tables


The data are presented in 8 large tables (98-105). As a statistician 
learns from experience, a competent summary of a large body of data 
is not an easy task. KPM give a detailed discussion of the accumulative 
incidence data for each type of outlet, followed by a similar discussion
of the frequency data. 

These detailed comments on what the data appear to show seem
sound, except that on two occasions where the younger group showed
greater sexual activity, KPM ignored or played down the difference
between the two groups (Section A-45).

Their general summary statement reads in part as follows:

“The changes that have occurred in 22 years, as measured by the data
given in the present chapter, concern attitudes and minor details of
behavior, and nothing that is deeply fundamental in overt activity. There has
been nothing as fundamental as the substitution of one type of outlet for
another, of masturbation for heterosexual coitus, of coitus for the
homosexual, or vice versa. There has not even been a material increase or decrease
in the incidences and frequencies of most types of activity. . . .

“And the sum total of the measurable effects on American sexual be- 
havior are slight changes in attitudes, some increase in the frequency of 
masturbation among boys of the lower educational levels, more frequent 
nocturnal emissions, increased frequencies of premarital petting, earlier 
coitus for a portion of the male population, and the transferences of a per- 
centage of the pre-marital intercourse from prostitutes to girls who are not 
prostitutes.” 


Some critics have objected strongly to this statement, particularly 
the first paragraph, on the grounds that it gives a biased report by 
brushing aside the differences in activity, which are almost all in the 
direction of higher or earlier sexual activity by the younger group. 
The reporting does appear a little one-sided, in that the reader is en- 
couraged to conclude that the differences are immaterial, although 
KPM do not state what they mean by a “material” increase. On the 
other hand, the catalogue of differences, given at the end of the second 
paragraph above, includes all differences noted either by KPM or
the critics, except for an increased homosexual activity in the younger 
group at educational levels 0-8 and 9-12. 


36. Validity of inferences 


Two objections have been made by some critics to any inferences 
drawn from a comparison of this type. The first is that the groups may
not be representative of their generations. KPM have attempted to 
dispose of this objection, at least in part, by holding educational level 
and marital status constant. It might be possible to go further and hold 
other factors constant, or at least examine whether the samples from 
the two generations differ in these factors. But with non-random 
sampling the objection is not removed even if a number of factors are 
held constant, because one or both groups might be biased with respect
to some factor whose importance was not realized. Various opinions 
may be formed as to the strength of the objection, but it can be re- 
moved only by the use of probability sampling accompanied by valid 
tests of significance. 

Secondly, in a comparison of this type, the older generation is describ- 
ing events which involve a much longer period of recall, with a possi- 
bility of distortion as events become distant. Further retake studies, 
if KPM can continue them for a sufficiently long period, may throw 
some light on the strength of this objection. 

The joint effect of these objections is to render the conclusions tenta- 
tive rather than definitely established. 


37. Vertical mobility 


This analysis (pp. 417-47) shows a degree of ingenuity and
sophistication which is not too common in quantitative investigations in
sociology. The data are arranged in a two-way array according to the
occupational class of the subject at the time of interview and the
occupational class of the parents. KPM examine whether the pattern of sexual
activity of the subject is more strongly associated with the parental 
occupational class than with that attained by the subject. They
conclude (p. 419):

In general, it will be seen that the sexual history of the individual accords 
with the pattern of the social group into which he ultimately moves, rather 
than with the pattern of the social group to which the parent belongs and 
in which the subject was placed when he lived in the parental home.

The most significant thing shown by these calculations (Tables 107-115) 
is the evidence that an individual who is ever going to depart from the 
parental pattern is likely to have done so by the time he has become adoles- 
cent. 


The amount of data which KPM present in this analysis is worth
mention as evidence that they do not shirk work. Tables are given for
7 types of activity. Three age groups are shown in each table. When
we classify by occupational level of subject and parent, this leads to
21 two-way tables. Five measures of the type of activity are given, so
that a painstaking examination extends over 105 two-way tables.

KPM appear to have paid most attention to the frequency data.
Their task is to determine whether this shows a stronger association
with the occupational class of the subject or of the parent. In reaching 
a verdict, they rely on judgment from eye inspection. By a similar eye 
inspection, we agree with their verdict as a descriptive statement of
what the data indicate, although different individuals might disagree
as to how definitely their statement holds. Judgments made by one 
individual for the data on frequencies were that in 7 of the 21 two-way 
tables, association with subject and parent either was not present at 
all or looked about equal. In 9 it looked mildly more with the subject 
and in 5 it looked strongly more with the subject. 

It would be of interest to undertake a more objective analysis. Analy- 
sis of variance techniques are available for this purpose, although some 
theoretical problems remain. 
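
As a rough indication of what such an analysis would compute, the sketch below splits the variation in a single two-way table of mean frequencies into a part associated with the subject's class (rows), a part associated with the parent's class (columns), and a residual. The table is invented, the cells are treated as equally weighted (which ignores the unequal numbers of subjects per cell), and no significance test is attempted, so this is only the first arithmetic step and leaves the theoretical problems untouched.

```python
# Hypothetical mean frequencies, invented for illustration (not KPM data).
# Rows: occupational class of the subject; columns: class of the parent.
table = [
    [1.0, 1.2, 1.1],
    [2.0, 2.1, 2.3],
    [3.1, 3.0, 3.2],
]


def two_way_sums_of_squares(cells):
    """Additive split of the total sum of squares for an unweighted table."""
    n_rows, n_cols = len(cells), len(cells[0])
    grand = sum(sum(row) for row in cells) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in cells]
    col_means = [sum(cells[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    ss_total = sum((x - grand) ** 2 for row in cells for x in row)
    ss_rows = n_cols * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n_rows * sum((m - grand) ** 2 for m in col_means)
    return {
        "subject": ss_rows,
        "parent": ss_cols,
        "residual": ss_total - ss_rows - ss_cols,
        "total": ss_total,
    }


ss = two_way_sums_of_squares(table)
for source, value in ss.items():
    print(f"{source:>8}: {value:.4f}")
```

In this invented table nearly all the variation lies between rows, the pattern which eye inspection would describe as association "strongly more with the subject."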

So far as interpretation is concerned, the principal disturbing factor 
is the possibility, which some critics have mentioned, that the subject’s 
reports of his activity are influenced by the social level to which he 
belongs at the time of interview. KPM maintain that attitudes towards 
different types of activity are strongly affected by the social level of the 
subject. Whether they change when he changes his social level would 
be interesting to discover. Something might be learned by retakes for 
subjects who had moved in the social scale. To obtain an abundant body 
of data of this kind will, however, be a slow and difficult process. 


CHAPTER IX. CARE IN INTERPRETATION 
38. Sample and sampled population 


In sample surveys, the inference from sample to sampled population 
is often relatively straightforward, although not trivial. We can usually 
set limits so that the statement “the sample agrees with the sampled 
population within these limits” has approximately the agreed-upon 
risk. (We may have to work fairly hard to set these limits correctly.)
But we have always to remember, and usually must remind the reader 
steadily, that these limits are not infinitely narrow. 

KPM's caution on page 153 (quoted in Appendix A, Section 48) is
a caution, but it is not repeated.

In general, their statements about small differences are more forth- 
right than we would care to make. 


39. Sampled population and target population 


When a respectable approximation of a probability sample is in- 
volved, the step from sampled population to target population is usual- 
ly short and the inference strong. Otherwise, the inference is often 
tortuous and weak. It depends on subject matter knowledge and intui- 
tion, and on other barely tangible considerations. These considerations 
deserve to be brought to the reader's attention, and to be discussed
as best the authors may.

This KPM did not do adequately. Their discussion of diversification
(p. 92) and 100 per cent samples (p. 93) is only a beginning.


40. Systematic errors of measurement 


Any quantitative study offers the possibility of systematic errors 
of measurement. It is generally agreed that these possibilities should be 
placed before the reader and discussed. 

In KPM's study these possibilities concentrate on the difference
between present reported and past actual behavior. KPM spent
Chapter 4 on this question. Their discussion is generally good, except
on some questions which arise in connection with generation-to-
generation comparison (see Sections A-25 and A-44).


41. Unsupported assertions 


We are convinced that unsubstantiated assertions are not, in them- 
selves, inappropriate in a scientific study. In any complex field, where 
many questions remain unresolved, the accumulated insight of an ex- 
perienced worker frequently merits recording when no documentation 
can be given. However, the author who values his reputation for ob- 
jectivity will take pains to warn the reader, frequently repetitiously, 
whenever an unsubstantiated conclusion is being presented, and will 
choose his words with the greatest care. KPM did not do this. 

Many of the most interesting statements in the book are not based 
on the tabular material presented and it is not made at all clear on what 
evidence the statements are based. Nevertheless, the statements are 
presented as if they were well-established conclusions. 


42. Some major controversial findings 


Some KPM findings about which much scientific discussion has cen- 
tered relate to: 
(i) stability of sexual patterns, 
(ii) homosexuality, and 
(iii) the effects of vertical mobility. 


In all these areas KPM have made forthright and bold statements. 
As discussed in more detail in Sections A-45 to A-47 (also see A-25),
there are reasons for caution in every one of the three areas. 


CHAPTER X. COMPARISON WITH OTHER STUDIES*


43. Interviewing


Good sex studies have been made using both the personal interview 
and questionnaire techniques. Given that just one technique is to be 
employed, KPM's choice of personal interview seems necessary if 
illiterates or near-illiterates are to be sampled. At present, it is good 
practice in gathering this type of data to endeavor to have all subjects 
give information on as many relevant points of the study as possible. 
No study seems to have done better on this matter than KPM. 

Whether it is always good practice to standardize the questions 
asked is debatable. KPM did not do this and give telling arguments 
against the practice. Some other studies have standardized the
questions, both in personal interview and in self-administered
questionnaires, and they have included good arguments in favor of their
procedure. In training interviewers KPM seem to have gone to greater
lengths (a year of training) in preparing for the specific interview used
in the study, than any of the other personal interview studies.
Information on training of interviewers is fairly hard to come by in all these
studies.

Given the choice of personal interview, it is not possible at this writ- 
ing to be logically certain whether the KPM technique is better or 
worse than that of the other interview studies, no matter whether one 
approves or disapproves of the tactics of a diagnostician or medical 
detective. Some discussion of how the KPM interview appeared to 
us is given in Appendix E. Numerous cross-checks on frequency and 
dates of occurrences appear within the KPM interview, while they 
seem to be lacking in most other studies. Setting aside points on which 
there is no evidence, KPM’s interviewing is as good as or better than 
that of the other studies reviewed. 


* The material in this chapter is our inference from the reviews supplied by W. O. Jenkins and
given in Appendix B. We have not personally read all the volumes concerned. The volumes are as
follows:

Bromley, Dorothy D., and Britten, Florence H. Youth and sex. New York: Harper and Brothers, 1938.

Davis, Katharine B. Factors in the sex life of twenty-two hundred women. New York: Harper and Brothers, 1929.

Dickinson, R. L., and Beam, Lura A. The single woman. Baltimore: Williams and Wilkins Co., 1934.

Dickinson, R. L., and Beam, Lura A. A thousand marriages. Baltimore: Williams and Wilkins Co., 1931.

Farris, E. J. Human fertility and problems of the male. White Plains, N.Y.: Author's Press, 1950.

Hamilton, G. V. A research in marriage. New York: A. and C. Boni, 1929.

Kinsey, A. C., Pomeroy, W. B., and Martin, C. E. Sexual behavior in the human male. Philadelphia: W. B. Saunders Company, 1948.

Landis, C., et al. Sex in development. New York and London: Paul B. Hoeber, 1940.

Landis, C., and Bolles, M. M. Personality and sexuality of the physically handicapped woman. New York and London: Paul B. Hoeber, 1942.

Terman, L. M., et al. Psychological factors in marital happiness. New York: McGraw-Hill Book Co., 1938.


44. Checks


As for checks on the interviewing process, KPM unquestionably 
lead the field with 100 per cent samples, retakes, spouse comparisons, 
early vs. late groups, interviewer comparisons, and the pubic hair study. 
Some authors mention casual checks with no data supplied. Bromley 
and Britten compare interview and questionnaire results on different 
groups. Davis reports a study where 50 subjects were interviewed 
before and after questionnaire administration, and offers a breakdown 
by consecutive 100 questionnaires received. Dickinson and Beam’s 
two books speak of comparing verbal reports and physical examination 
results as a way of verifying the record rather than as a check—no 
records seem to be published. Farris’ comparison of reported vs. per- 
sonally recorded masturbatory rates omits the critical comparative 
information. Hamilton finds that different question wordings give 
different responses, but leaves the matter here. Landis and Bolles use 
several independent judges for evaluation of scales—but, instead of 
comparing their results, argue that agreement will be good because of 
experience and training. They do not compare normal with handicapped 
subjects. Landis checks with the psychiatric case history as a means 
of eliminating subjects with discrepancies, and gives data on the
agreement of independent judges’ ratings. Terman offers spouse comparisons.
When KPM’s checks are viewed with those of the other leading sex 
studies in mind, it is clear that a new high level has been established. 


45. Sampling


All studies used volunteer non-probability samples. Some were drawn
from more specifiable target populations than others. For example, 
Bromley and Britten drew exclusively from college volunteers, while 
Davis used mail-questionnaire respondents from lists of Women’s 
Clubs and college alumnae. Others used well-to-do patients, or clinic 
groups. Aside from KPM, Bromley and Britten is the only study that 
seems to have attempted to get nationwide geographic representation 
(we have omitted M. J. Exner’s 1915 study), while Davis has covered 
the eastern area, and Terman covers part of the California area. Al- 
though KPM’s sample is heavily charged with college students, a 
broader representation of social and educational levels is offered than 
in the other studies. All studies reviewed have special features which 
make generalizations to specific populations difficult. Certainly KPM's 
sampling seems never worse and often better than that of the other 
studies.


46. Analysis

Most studies confined their analysis to simple descriptive statistics— 
percentages, means, and medians. A few added ranges, standard
deviations, correlation coefficients, and attempted significance tests. About
half used two-way breakdowns, usually on background characteristics,
as a way of sharpening differences between groups. Three studies
offered scales either based on judges’ evaluations (Landis, and Landis
and Bolles), or scoring of batteries of items (Terman). KPM restricted
the use of scales to occupational classification and
homosexual-heterosexual rating. They added the accumulative incidence curve, the U. S.
corrections, and extensively used fine-grained (high-order) breakdowns. 
In general, KPM's analysis employed more devices and was more 
searching than the analyses offered by other studies. 


47. Interpretation 


We have already mentioned (33) that KPM are competent at the 
accurate and understandable verbal description of the meanings of a 
table whose entries are taken as correct. Some of the other authors 
have also done well, although the extent of their analysis is usually 
more limited. In inferring from sampled population to target popula- 
tion, all the studies are weak. The inferences left with the reader (if 
we are to judge) are much broader than the studies could possibly 
warrant. Every study has its own precautionary remarks to the effect 
that the reader must not extend the inferences beyond that of the 
population studied. Very little attempt is made to describe the target 
population, to help the reader with the step from sample to sampled 
population, or to remind him of sampling fluctuations. The precaution- 
ary remarks in the opening pages of a study are usually forgotten when 
the authors come to discuss matters of national policy, morals, legisla- 
tion, therapy, and psychological and sociological implications toward 
the end of their book. The reader must then be left with the inference 
that the findings apply on at least a national scale. Bromley and Britten 
are more forthright than most. They argue overtly that their volunteer 
college sample is representative of all U. S. individuals of college age.
Of the 10 studies considered, only two, Davis and Farris, seem to have 
consistently exercised due caution about generalization from sample 
to population and warnings to the reader. The last paragraph of the 
section entitled, “Description of Sample and Sampling Methods” 
in each review in Appendix B gives one reader's opinion of the general- 
izations from sample to sampled population intended by the author. 

Our reviewer was not asked to gather data that would give us a way 
of comparing the extent of unsupported statements in the other
studies with those of KPM, so this aspect of interpretation remains
uncompared by us. It would be very interesting if someone would collect
such information, not only in connection with the present work, but 
with regard to general scientific writing in various fields. This would be 
no small task. 

CHAPTER XI. CONCLUSIONS 
48. Interviewing 


(1) The interviewing methods used by KPM may not be ideal, but 
no substitute has been suggested with evidence that it is an
improvement.

(2) The interviewing technique has been subjected to many criti- 
cisms (see Section A-11), but on examination the criticisms usually 
amount to saying “answer is unknown,” or “KPM have not demon- 
strated how good their method is.” 

These conclusions can be summarized by saying that we need to know 
more about interviewing in general. 


49. Checks 


(1) The types of methodological checks considered by KPM seem 
to be quite inclusive. 

(2) A greater volume of checks (more retakes, etc.) is desirable, as is
more delicate analysis. (See Sections C-15 and C-18.)

(3) The results of duplicate recording of interviews should be pub- 
lished. 

These conclusions can be summarized by saying that KPM’s checks 
were good, but they can afford to supply more. 


50. Sampling 


Given U. S. white males as the target population, our conclusions
are that: 

(1) KPM's starting with a nonprobability sample was justified.

(2) It should perhaps already have been supplemented by at least 
a small probability sample. 

(3) If further general interviewing is contemplated, and perhaps even 
otherwise, a small probability sample should be planned and taken. 

(4) In the absence of a probability-sample benchmark, the present 
results must be regarded as subject to systematic errors of unknown 
magnitude due to selective sampling (via volunteering and the like). 


51. Analysis

KPM's analysis is best described as simple and relatively searching.
They did not use such techniques as analysis of variance or multiple
regression, but they brought out the indications of their data in a
workmanlike manner.

In more detail: 

(1) their selection of variables for adjustment seemed to be a reasonably
effective substitute for more complex analyses,

(2) they gave several measures of activity (giving the reader a choice
at the expense of more tables to examine),

(3) they made essentially no use of tests of significance, but cited
many standard errors (which were inappropriate for their cluster samples),

(4) they used U. S. Corrections and their (independently developed)
accumulative incidence curve. More careful exposition of these devices
would have been desirable.

To summarize in another way:


(i) they did not shirk hard work, and
(ii) their summaries were shrewd descriptive comments rather than
inferential statements about clearly defined populations.


Their main attempt at inferences was a sample size experiment whose
results (i) could have been predicted by statistical theory, and (ii) were
irrelevant to their cluster sampling.

They continued to add new interviews without redoing earlier tabu- 
lations, thus producing an unwarranted effect of sloppiness in the book, 
although their records were kept carefully and in unusually good 
shape.


52. Interpretation 


(1) KPM showed competence in accurate and understandable verbal
description of the trends and tendencies indicated by their tables.
In stating and summarizing what the sample seems to show, they
were competent and effective.

(2) Their discussion of the uncertainties in the inferences from the 
numbers in the tables to the behavior of all U. S. white males was brief, 
insufficiently repeated, and oftentimes entirely lacking. In instilling 
due caution about sampling fluctuations and differences between 
sampled and target populations, they were lax and ineffective. 

(3) Their discussion of systematic errors of reporting is careful and 
detailed (with the exception of some questions bearing on generation
comparisons).

(4) Many of their most interesting statements are not based on the
tables or any specified evidence, but are nevertheless presented as
well-established conclusions. Statements based on data presented,
including the most important findings, are made much too boldly and
confidently. In numerous instances their words go substantially beyond
the data presented and thereby fall below our standard for good
scientific writing.

53. Comparison with other studies

In comparison with nine other leading sex studies, KPM's work is
outstandingly good. 

In more detail, 

(1) their interviewing ranks with the best, 

(2) they have more and better checks,

(3) their geographic and social class representation is broader and 
better, 

(4) their volunteer non-probability sample problem is the same, 

(5) they used more varied and searching methods of analysis, 

(6) only two of the nine studies (Davis and Farris) were more care- 
ful about generalization and warned the reader more thoroughly 
about its dangers. 

Thus, KPM’s superiority is marked. 

54. The major controversial findings 

It is perhaps fair to regard these four as KPM's major controversial 
findings: 

(1) a high general level of activity, including a high incidence of
homosexuality,

(2) a small change from older to younger generations,

(3) a strong relation between activity and socio-economic class, 

(4) relations between activity and changes of socio-economic class. 

All of these KPM set forth as well established conclusions. All are 
subject to unknown allowances for:

(a) difference between reported and actual behavior, 

(b) nonprobability sampling involving volunteering. 

While their findings may be substantially correct, it is hard to set 
any bounds within which the truth is statistically assured to lie (see
Appendix A, Section 4). Once again, we wish to point out that the same
difficulties are present in many sociological investigations. 


CHAPTER XII. SUGGESTED EXTENSIONS 
55. Probability sampling

Appendix D discusses the advantages, possibilities and difficulties
of probability sampling in some detail.

In brief summary:

(1) Costs and refusal rates together determine the wisdom of extensive
probability sampling.

(2) Information on costs and refusal rates is lacking. 

(3) Hence probability sampling should begin on a very small scale, 
say 20 cases. 

(4) A step-by-step program, starting at such a scale, seems wise, and 
is recommended to KPM.


56. Retakes 

While retakes showed high agreement on vital statistics, and moder- 
ately high agreement on incidence, the data presented in KPM for 
frequencies show considerably less agreement. The data do not make 
clear how much better a retake agrees with a take than with a randomly 
selected interview for another subject with the same age, religion, 
social class, etc. 

If the agreement is better, then retakes will provide evidence as to 
non-random agreement—evidence bearing on the much-discussed sub- 
ject of the constancy of recall. In addition, take-retake differences are 
clearly so large as to make retakes of two old subjects at least as valu- 
able as a take of one new subject in determining the average behavior 
of groups (see Section A-24). 

If the agreement is no better, then retakes will provide evidence 
that this was so, and every retake will be as valuable as a new take in 
determining the average behavior of groups. 

In our opinion 500 retakes would help the standing of KPM’s data
more than 2000 new interviews (selected in the same old way). It would 
of course be important to determine and report the selective factors 
which influenced the selection of the retaken subjects. 


57. Spouses


Separate interviews of husband and wife are a useful supplement to 
retakes, in that they supply the nearest approach to two independent 
reports of the same action, although the information is restricted for 
the most part to marital coitus, and is weakened by the possibility 
of collusion. In the book, KPM present comparisons for 231 pairs of 
spouses. 

In an expansion of this program, various elaborations could be sug- 
gested. The first objective should probably be to interview more pairs 
from the lower educational levels, in order that the agreement between 
spouses can be examined separately for different educational levels.
As in the case of retakes, the data are not wasted so far as the main
study is concerned, since they contribute both to the male and female
samples.


58. Presentation


As the critics point out (Chapters VII-A, I-C), parts of the book
are hard to understand because of lack of clarity of presentation. In
future editions, the following steps would remove the major ambigui- 
ties. 

(i) KPM should explain why the numbers of cases change erratically
from table to table. In future publication it would be worth substantial 
effort to avoid these changes. 

(ii) Table headings and contents should be critically reviewed as to
their lucidity. 

(iii) Worked examples of the calculation of U. S. corrections should 
be given. References under the tables to the variables used for correc- 
tion should be more precise. 

(iv) More discussion should be given, with numerical illustration, 
of the meaning of accumulative incidence percentages. 

(v) More information should be given about the questions asked, 
with their variations, in the interview. Although this would be extremely
laborious to do for the complete interview, one or two blocks of related
questions might serve the purpose. For such a block, KPM might
describe (a) the variations used in the statement of the questions, (b)
the variations in the order of questions, and (c) the reasons for the
variations. An illustration of this type would give deeper insight into the
logical structure of KPM's interviewing technique and might go far
to substantiate their claim (p. 52) that flexibility is one of the strengths
of their technique.

(vi) Several critics make a strong plea that more information be 
given about the composition of the sample (see Chapters I-A, I-C). 
The specific items requested vary with the critic, and some would be 
a major undertaking both in preparation and publication. A minimum 
that seems feasible would be to present a multiple classification of the
subjects according to the following items at the time of interview: age, 
marital status, occupation, educational status, religious affiliation, place 
of residence. In addition, more information is needed about the extent 
to which special groups (e.g., those in penal institutions, homosexual 
groups) contribute to the tables. 


59. Statistical analyses 

In Appendix C, a number of statistical analyses are outlined which 
would be a useful contribution to the methodology of studies of this 
kind. The analyses would require expert statistical direction.

As has been pointed out, the standard errors presented by KPM
are invalid, because they were computed on the assumption of random
sampling of individuals. A method for calculating standard errors so
as to take into account the actual nature of KPM's sampling is given
in Chapter II-C. These standard errors would allow a realistic appraisal 
of the stability of KPM's means. They would indicate by how much 
the means determined from the present KPM sample are likely to vary 
from the means of a much larger sample of cases obtained by the KPM 
methods. 
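
The contrast between the two kinds of standard error can be illustrated by a minimal sketch in Python. It is not the method of Chapter II-C. It assumes, purely for illustration, that subjects were obtained in groups (clusters), that the groups may be treated as a simple random sample of groups, and that the usual ratio-estimator approximation for the standard error of a mean applies. The function names and the toy figures are hypothetical.

```python
import math

def naive_se(clusters):
    """Standard error of the mean computed as if every individual were an
    independent random draw (the assumption criticized in the text)."""
    values = [v for cluster in clusters for v in cluster]
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(variance / n)

def cluster_se(clusters):
    """Standard error of the overall mean with the clusters, not the
    individuals, treated as the sampling units (ratio-estimator
    approximation; an assumption of this sketch)."""
    k = len(clusters)
    total_n = sum(len(cluster) for cluster in clusters)
    mean = sum(sum(cluster) for cluster in clusters) / total_n
    # Residual of each cluster total about what the overall mean predicts.
    residuals = [sum(cluster) - mean * len(cluster) for cluster in clusters]
    variance = k / (k - 1) * sum(r * r for r in residuals) / total_n ** 2
    return math.sqrt(variance)

# Hypothetical data: three groups whose members resemble one another.
groups = [[1.0, 1.2, 0.9, 1.1], [3.0, 3.2, 2.9, 3.1], [5.0, 5.1, 4.9, 5.2]]
print(naive_se(groups), cluster_se(groups))
```

When members of a group resemble one another, as in these invented figures, the cluster-based standard error comes out larger than the naive one, which is the direction of the error discussed here.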

KPM described orgasm rates in terms of per cent incidence and mean 
or median frequency. However, other mathematical functions of these 
variables may be more appropriate, leading to simpler statements of 
the results. Approaches for investigating this question, and the related 
question of the use of some combination of the variables, are suggested 
in Chapters III-C and IV-C.

The question of applying adjustments to segment means has already
been discussed (Section 17). A technique is presented (Chapter V-C) 
for reaching practical decisions on the appropriateness of adjustment 
and on the number of variables for which adjustment should be made. 


60. Relative priorities 


We give here our personal collective opinion as to how further effort 
on the male study might best be spent (we have not tried to evaluate 
priorities in comparison with the female study, or any other studies 
which KPM may contemplate).

If the interviewer time which it would require were available, we 
believe that the effort required for the proposed probability sample 
would be worthwhile. 

So long as it did not interfere with the possibility of a probability 
sample, available interviewer time should be concentrated: 


on retakes when working in or near old areas. 
on husband-wife pairs when two interviewers are available. 


If the probability sample has already been ruled out, and if fewer 
interviewer months are available, then an attempt to retake a random 
sample of previous subjects would be most desirable, whenever possible, 
husband and wife being taken whenever either is retaken. 

Effort in the form of statistical analysis and presentation need not
interfere with interviewing, and should be pressed to the extent that 
experienced and understanding personnel can be found. 




THE INVENTORY PROBLEM* 


J. LADERMAN, Office of Naval Research 
S. B. LITTAUER, Columbia University
LIONEL WEISS, University of Virginia and Cornell University


This article is expository, and is based on the two papers, "The
Inventory Problem," by A. Dvoretzky, J. Kiefer, and J. Wolfowitz,
which appeared in the April 1952 and July 1952 issues of Econometrica. 
These papers are too advanced mathematically to be read by many 
of those to whom the results might be of interest. It is hoped that this 
paper will bring the new technique to the attention of those persons, 
both in government and private industry, who are responsible for mak- 
ing decisions affecting the amount of inventory to be held by their 
organizations. In the opinion of the present authors, great economies 
can result from the application of this new inventory control technique. 
The inventory problem can be stated very simply: it is to decide 
how much material to stock in preparation for an uncertain future. 
Both understocking and overstocking are costly, else there is no prob- 
lem. If overstocking is not penalized, such large stocks could be held 
that no conceivable future occurrence would deplete them; if under- 
stocking is not penalized, zero stocks could be held. The usual cases, 
where both understocking and overstocking are costly, are the ones of 
interest here. For example, the proprietor of a restaurant, buying 
perishables for the day, will see them spoil if he buys too many, or will 
turn customers away unsatisfied if he buys too few, thus failing to earn
potential profits and perhaps permanently losing some customers. Even 
if a merchant does not deal in perishables, overstocking may involve
carrying costs which include such items as rent, insurance, deprecia- 
tion, loss of interest on capital invested, etc. As a less homely example, 
an army would certainly be heavily penalized for being caught short of 
ammunition, but since there are other important items needed by an 
army, it would be possible to stock too much ammunition at the
sacrifice of other military items.
The reader can no doubt think of other cases, closer to home, where
a balance must be struck between overstocking and understocking.
The purpose of this article is to describe a method of striking this
balance so as to minimize the losses to be expected from taking the risks
of overstocking or understocking, which are unavoidable when one
has to provide for an uncertain future demand.

As a first step, we describe a simple but important concept, the
"schedule of losses." The schedule of losses is a schedule which shows
what the loss is for any given combination of stock held and future
demand. A profit is regarded as a negative loss. The schedule of losses can
often be simply expressed by a mathematical formula. For example,
suppose a newspaper vendor buys y papers from the publisher for 3
cents a copy, sells d copies for 5 cents a copy, and resells the unsold
copies to the publisher for 1 cent a copy. Then his loss in cents is
−2d + 2(y − d), because he makes 2 cents profit on each of the d papers
he sells and loses 2 cents on each of the (y − d) papers he returns,
where of course the number sold to customers, d, cannot exceed the
number bought from the publisher, y. Thus for any possible combination
of numbers of papers bought and sold, we can compute the loss
incurred by the vendor. The schedule of losses in this case is typical
of many situations where unsold stock depreciates in value (perhaps 
even becomes worthless). 
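
The vendor's schedule of losses can be written as a small function. The following is a minimal sketch in Python; the function name `vendor_loss` and its parameter names are ours, chosen for illustration, and the default prices are those of the example (bought at 3 cents, sold at 5 cents, returned at 1 cent).

```python
def vendor_loss(y, d, cost=3, price=5, salvage=1):
    """Loss in cents when y papers are bought and d of them are sold.

    A profit is counted as a negative loss. The names and defaults are
    illustrative assumptions matching the newspaper example.
    """
    if not 0 <= d <= y:
        raise ValueError("the number sold, d, must lie between 0 and y")
    profit_on_sales = (price - cost) * d          # 2 cents on each paper sold
    loss_on_returns = (cost - salvage) * (y - d)  # 2 cents on each paper returned
    return -profit_on_sales + loss_on_returns


# Buying 250 papers and selling 200 gives -2(200) + 2(50) = -300 cents.
print(vendor_loss(250, 200))  # -300
```

A negative result is a profit: in the case printed, the vendor is 300 cents ahead.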

The schedule of losses includes losses arising from the four following
categories:


a) Negative of the profit from a transaction or other gain from the completion
of a mission.

b) Carrying costs, which are the losses arising from the stocking of the
commodity.

c) Losses due to depletion, which arise when the demand exceeds the available
supply.

d) Ordering costs, which are the costs involved in processing an order to
change the inventory level.


In the above newspaper example the schedule of losses involved only 
items a and b. The −2d represented the negative of the profit from the
sale of the d papers and the 2(y − d) represented the loss due to
obsolescence of the papers (a carrying cost).

It is almost inconceivable that a person responsible for making decisions
would not have a fairly good idea of what the schedule of losses
is for his case. In the absence of any knowledge at all about the sched- 
ule of losses, it is difficult to imagine on what rational grounds the size 
of inventory can be set. From now on, we shall assume that the sched- 
ule of losses is known, at least approximately, and shall then describe 
a method of choosing the size of inventory to be held. 

First we discuss a particularly simple case where stock can only be
ordered or returned to the supplier (a negative order) at the beginning
of a time interval, and only the stock held after the order is placed can 
be used to meet the demand that will arise during the interval. We as- 
sume that there is instantaneous delivery of the order, and also that the 
future demand is completely known. It is this last assumption that
makes this case so simple, for once the schedule of losses is known and 
the future demand is known, we simply place an order of a size that will 
minimize the loss. Thus, in the case of the newspaper vendor above, it 
is clear that the number of copies he should buy from the publisher is 
the number of copies he will be able to sell (i.e. the total demand). For 
he loses 2 cents on each unsold paper, but makes 2 cents on each paper 
he sells. And in any other case where the future demand is known, it 
is a simple matter to choose an order that will minimize the loss.

It is when the future demand becomes uncertain that we meet diffi- 
cult and more realistic cases. First we discuss how we shall interpret 
"uncertain future demand." We will not interpret this as meaning com- 
plete lack of knowledge about future demand, nor, obviously, do we 
mean that we know exactly what the future demand is going to be.
"Uncertain future demand" to us shall mean something between complete
lack of knowledge and complete certainty; namely, that future demand
is a chance variable with a known probability distribution. In
other words, future demand may have any one of several values, with
known probabilities. The reader will perhaps inquire under what
circumstances we would be justified in regarding demand as a chance
quantity. We will not try to give a complete answer here, but will note 
that demand may depend on various factors of a chance nature, there- 
by making demand itself a chance quantity. For example, demand may 
depend upon the weather, which itself is frequently considered as 
though it depends on chance.

To make these ideas more specific, let us take the case of the newspaper
vendor discussed above. Suppose he is located in a suburban railroad
station, and that each morning there are 200 customers who reach
the station early enough to buy a paper from him. Another 50 potential
customers arrive at the station in a bus. If the bus arrives early, each
of the 50 buys a paper, but if the bus arrives late, none of the 50 has
time to buy a paper. Let us assume that the bus arrives late half of the
time, and there is no way of telling beforehand on any day whether or
not the bus will be late. Then it is clear that the demand for the
vendor's papers on any given day is a chance variable which can take the
value 200 with probability 1/2, or 250 with probability 1/2. This means
that, in the long run, on 1/2 of the days the demand will be for 200
papers, on the other 1/2 of the days, for 250 papers. How many papers
should the vendor buy from the publisher in this case?

The vendor's loss will be a chance variable because it will depend 
upon the demand, itself a chance variable. The probability distribution
of the loss will depend upon the number of papers the vendor buys 
from the publisher. (We remind the reader that the distribution of a 
chance variable is simply a list of the possible values of the chance vari- 
able with their respective probabilities.) Roughly speaking, it is clear 
that the size of the vendor's purchase from the publisher should be such
as to make the probabilities of large losses small. This rather vague
statement, however, is not explicit enough to enable us to decide just 
what the size of the vendor's purchase should be. Many different inter- 
pretations can be made, but throughout the remainder of this paper we 
are going to choose the size of the order to make the expected value of 
the loss as small as possible. A justification for using the procedure 
which minimizes the expected value of the loss is that such a procedure 
is the best one to use if one wishes to minimize the average loss in the 
long run. In the next paragraph we shall review briefly the concept of 
expected value. 

If one takes many observations on a chance variable, the average of
the observations will ordinarily tend to some number. That number is
called the expected value of the chance variable. More precisely, for
our purposes the expected value of a chance variable may be defined
as the weighted average of all the values the chance variable can take
on, with the probability of each value as its weight. For example,
suppose a chance variable can take on the values 1, 2, or 3 with
probabilities of 1/4, 1/2, 1/4 respectively. Then the expected value is
given by (1/4)(1) + (1/2)(2) + (1/4)(3) = 2. Clearly, if many observations
are made on this chance variable, about 1/4 of them will have the value 1,
about 1/2 of them will have the value 2, and the remaining ones will have
the value 3, so the average of all the observations will usually be close to 2.
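
The definition can be checked numerically. The sketch below, in Python, computes the weighted average for a chance variable taking the values 1, 2, 3, with the probabilities taken to be 1/4, 1/2, 1/4. The function name `expected_value` is an illustrative choice of ours, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def expected_value(distribution):
    """Weighted average of the values, each weighted by its probability.

    `distribution` maps each possible value to its probability.
    """
    if sum(distribution.values()) != 1:
        raise ValueError("the probabilities must sum to 1")
    return sum(value * prob for value, prob in distribution.items())

example = {1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(1, 4)}
print(expected_value(example))  # 2
```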

It is clear from the discussion of the preceding paragraph that choosing
the size of the order to minimize the expected value of the loss is
not an unreasonable procedure, since the smaller the probabilities of the
larger losses, the smaller the expected value of the loss. Also, if such a
policy is applied over and over, the average loss will usually be less than
that obtained from any other policy.

Before we actually compute the size of the order that will minimize 
the newspaper vendor's expected loss, we shall discuss a possible objection
to our whole procedure. The practical definition of probability is in
terms of the “long-run.” That is, when we say that the probability of
an event is 1/2, we mean that in a long series of experiments or trials,
the event will occur about 1/2 of the time. Therefore, it might be asked, of
what use is probability theory, whose statements in practice refer to the 
long run, in a problem like that of the newspaper vendor, who is con- 
cerned with his loss in one particular day? In many cases, no answer to 
this objection is necessary, since we will be dealing with a long series of 
trials. Even our newspaper vendor will presumably be trying to sell 
his papers day after day under the same circumstances, so that min- 
imizing his expected loss for one day is equivalent to minimizing his 
average loss per day over all the days he will be selling papers. In those 
cases where there will not be a long series of trials, an answer to the 
objection might be that even though, in practice, the probability of an 
event in one trial is the proportion of times it would occur in a long
series of identical trials, even if only one trial were made, the higher 
the probability of the event, the greater would be our confidence that 
the event would occur in that one trial. For example, if one were told 
that he will be executed if he draws a red card from a deck, he would
certainly prefer to draw the card from a deck of 51 black cards and 1 
red card rather than from a regular deck of cards. 

Returning now to the newspaper vendor, we want to find how many
newspapers he should buy from the publisher in order to minimize his
expected loss, where his schedule of losses (from above) is −2d + 2(y − d)
= 2y − 4d, and the number of customers who will seek to buy a paper
is a chance variable with possible values of 200 or 250, each with
probability 1/2. If the vendor buys 200 or fewer papers from the publisher,
all the copies will be sold to customers, none resold to the publisher,
so the loss will be minus twice the number bought from the
publisher, namely −2y. From this it is apparent that no fewer than
200 copies should be bought from the publisher, for the loss decreases
as the number bought increases from zero to 200. Also, it is clear that no
more than 250 papers should be bought, for the total in excess of 250
will surely have to be resold to the publisher at a loss of 2 cents each.
So the proper number to order to minimize the expected loss is between
200 and 250 inclusive. Then the loss will be either 2y − 4(200) with
probability 1/2 (that is when the bus is late), or else 2y − 4y = −2y with
probability 1/2 (this is when the bus is not late, making d = y). Therefore
the expected loss when the number bought from the publisher is between
200 and 250 is equal to (1/2)(2y − 800) + (1/2)(−2y), which equals minus
400 cents. Thus it turns out that the expected loss is the same for any
order between 200 and 250 and it is greater for any other order. Just
to be specific, let us agree that whenever more than one order will
achieve the minimum expected loss, we will choose the smallest of the
orders. Thus in this case we would order 200 papers.
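
The argument can be confirmed by direct enumeration. The following Python sketch, with function names of our own choosing, computes the expected loss for every order size from 0 to 300 under the demand distribution of the example (200 or 250, each with probability 1/2) and applies the convention of taking the smallest order among those that tie. The upper limit of 300 is an arbitrary illustrative bound.

```python
from fractions import Fraction

# Demand distribution of the example: 200 or 250 papers, each with probability 1/2.
DEMAND = {200: Fraction(1, 2), 250: Fraction(1, 2)}

def loss(y, demand):
    """Loss in cents, 2y - 4d, where d = min(y, demand) papers are actually sold."""
    d = min(y, demand)
    return 2 * y - 4 * d

def expected_loss(y):
    """Expected loss in cents when y papers are bought."""
    return sum(prob * loss(y, demand) for demand, prob in DEMAND.items())

def best_order(max_order=300):
    """Smallest order size achieving the minimum expected loss, with that loss."""
    losses = {y: expected_loss(y) for y in range(max_order + 1)}
    smallest = min(losses.values())
    return min(y for y, value in losses.items() if value == smallest), smallest

order, value = best_order()
print(order, value)  # 200 -400
```

Calling `expected_loss(y)` for any y from 200 to 250 returns the same value, which is why a tie-breaking convention is needed at all.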

We shall now give another example to illustrate how this method can 
be applied to an actual inventory problem of the Navy. There are certain
rather expensive items (some costing over $100,000 each) known
as "insurance spares" which are generally procured at the time a new
class of ships is under construction. These spares are bought even 
though it is known that it is very unlikely that any of them will ever 
be needed and that they cannot be used on any ship except those of 
that particular class. They are procured in order to provide insurance 
against the rather serious loss which would be suffered if one of these 
spares were not available when needed. Also, the initial procurement 
of these spares is intended to be the only procurement during the life- 
time of the ships of that class because it is extremely difficult and costly 
to procure these spares at a later date. The present policy is to order 
quantities of these spares according to the following schedule: 


Total number of       Number of spares
items installed       ordered
1-4                   1
5-50                  2
51-100                3
over 100              4


This particular ordering policy is based on the judgment of personnel 
familiar with the expected usage rate of such technical spares and also 
experienced with procurement policies of the Navy. However, they 
will admit that the construction of such a table is largely an intelligent 
guess and that the quantities shown to be ordered cannot be justified 
objectively. On the other hand, the procedure to be given below, based 
on the previous discussion, will lead to an objective method of con- 
structing such a table. Moreover, by using this procedure the total loss 
over a long period of time will ordinarily be less than that obtained from 
any other ordering policy. 

Let us suppose N ships are being constructed of a certain class containing
an item of the type described above, for which spares cost P
dollars each. Let p_i represent the probability that exactly i spares will
be needed as replacements during the lifetime of the N ships; that is,
p_1 is the probability that exactly one spare will be needed, p_2 is the
probability that exactly two spares will be needed, etc. Let us also assume
that the probability of 5 or more spares being needed is zero. Let
L dollars be the loss (usually quite large) suffered for each spare that is
needed when there is none available in stock. In obtaining the schedule
of losses we shall neglect all the smaller losses and include only the cost
of the spares (which become worthless if never used) and the depletion
loss occurring when spares are needed but not available. Then for any
d ≤ y, there is no loss from depletion, so the schedule of losses would
simply be the number bought multiplied by the unit price, which is yP.
For d > y, the schedule of losses would be yP + (d − y)L because (d − y)
is the number of spares needed but not available, and L is the loss
incurred for each one short. Hence, the schedule of losses, which is
denoted by W(y, d), is

\[
W(y, d) =
\begin{cases}
yP & \text{for } d \le y, \\
yP + (d - y)L & \text{for } d > y.
\end{cases}
\]


Now let us get the expected value of the loss, EW(y, d), for each value
of y from y = 0 to y = 4. These are the only values of y which need be
considered because if y is greater than 4, the loss will surely be greater
than the loss for y = 4. For y = 0 we have that d ≥ y for all possible
values of d, hence

\[
\begin{aligned}
EW(y, d) &= 0 \cdot P + [\text{Prob. } d = 0](0 - 0)L + [\text{Prob. } d = 1](1 - 0)L \\
&\quad + [\text{Prob. } d = 2](2 - 0)L + [\text{Prob. } d = 3](3 - 0)L \\
&\quad + [\text{Prob. } d = 4](4 - 0)L \\
&= (p_1 + 2p_2 + 3p_3 + 4p_4)L \qquad \text{for } y = 0.
\end{aligned}
\]

In a similar manner, we get

\[
\begin{aligned}
EW(y, d) &= P + (p_2 + 2p_3 + 3p_4)L && \text{for } y = 1 \\
&= 2P + (p_3 + 2p_4)L && \text{for } y = 2 \\
&= 3P + p_4 L && \text{for } y = 3 \\
&= 4P && \text{for } y = 4.
\end{aligned}
\]
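
The same expected losses can be obtained mechanically by averaging the schedule of losses over the distribution of d. The Python sketch below does this. The function names are ours, and the numerical inputs are illustrative (P = $100,000, L = $10,000,000, p_1 = .04, p_2 = .01, p_3 = .001, p_4 = .0002), with p_0 taken to be whatever probability remains.

```python
def schedule_of_losses(y, d, P, L):
    """W(y, d): cost of the y spares bought plus the loss for each spare short."""
    return y * P + max(d - y, 0) * L

def expected_losses(P, L, probs):
    """Expected loss EW(y, d) for y = 0, ..., len(probs) - 1.

    probs[d] is the probability that exactly d spares are needed.
    """
    return [
        sum(p * schedule_of_losses(y, d, P, L) for d, p in enumerate(probs))
        for y in range(len(probs))
    ]

# Illustrative inputs: probabilities that exactly 1, 2, 3, 4 spares are needed.
needed = [0.04, 0.01, 0.001, 0.0002]
probs = [1 - sum(needed)] + needed  # prepend p_0, the chance that none is needed

for y, value in enumerate(expected_losses(100_000, 10_000_000, probs)):
    print(y, round(value))
```

Summing over d directly, rather than coding the five closed-form expressions, keeps the sketch valid if the assumed largest possible demand is changed.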


Now for any given values of P, L, and p_i it is a simple matter to tabulate
the values of EW(y, d) for the different values of y in order to determine
which value of y gives the smallest expected loss. For example,
suppose P = $100,000, L = $10,000,000, p_1 = .04, p_2 = .01, p_3 = .001,
p_4 = .0002, and the probability of 5 or more spares being needed is zero
(hence p_0 = .9488), then

for y = 0, EW(y, d) = (.04 + .02 + .003 + .0008)(10,000,000) = $638,000
for y = 1, EW(y, d) = 100,000 + (.01 + .002 + .0006)(10,000,000)
                    = $226,000
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achieve the minimum expected loss, we will choose the smallest of the 
orders. Thus in this case we would order 200 papers. 

We shall now give another example to illustrate how this method can 
be applied to an actual inventory problem of the Navy. There are cer- 
tain rather expensive items (some costing over $100,000 each) known 
as "insurance spares" which are generally procured at the time a new 
class of ships is under construction. These spares are bought even 
though it is known that it is very unlikely that any of them will ever 
be needed and that they cannot be used on any ship except those of 
that particular class. They are procured in order to provide insurance 
against the rather serious loss which would be suffered if one of these 
spares were not available when needed. Also, the initial procurement 
of these spares is intended to be the only procurement during the life- 
time of the ships of that class because it is extremely difficult and costly 
to procure these spares at a later date. The present policy is to order 
quantities of these spares according to the following schedule: 


    Total number of       Number of spares
    items installed           ordered
          1-4                    1
         5-50                    2
        51-100                   3
       over 100                  4


This particular ordering policy is based on the judgment of personnel 
familiar with the expected usage rate of such technical spares and also 
experienced with procurement policies of the Navy. However, they 
will admit that the construction of such a table is largely an intelligent 
guess and that the quantities shown to be ordered cannot be justified 
objectively. On the other hand, the procedure to be given below, based 
on the previous discussion, will lead to an objective method of con- 
structing such a table. Moreover, by using this procedure the total loss 
over a long period of time will ordinarily be less than that obtained from 
any other ordering policy. 

Let us suppose N ships are being constructed of a certain class
containing an item of the type described above, for which spares cost P
dollars each. Let \(p_i\) represent the probability that exactly i spares
will be needed as replacements during the lifetime of the N ships; that
is, \(p_1\) is the probability that exactly one spare will be needed,
\(p_2\) is the probability that exactly two spares will be needed, etc.
Let us also assume that the probability of 5 or more spares being needed
is zero. Let L dollars be the loss (usually quite large) suffered for
each spare that is needed when there is none available in stock. In
obtaining the schedule of losses we shall neglect all the smaller losses
and include only the cost of the spares (which become worthless if never
used) and the depletion loss occurring when spares are needed but not
available. Then for any \(d \le y\), there is no loss from depletion, so
the schedule of losses would simply be the number bought multiplied by
the unit price, which is P. For \(d \ge y\), the schedule of losses would
be \(yP + (d - y)L\) because \((d - y)\) is the number of spares needed
but not available, and L is the loss incurred for each one short. Hence,
the schedule of losses, which is denoted by W(y, d), is

\[
W(y, d) =
\begin{cases}
yP & \text{for } d \le y, \\
yP + (d - y)L & \text{for } d \ge y.
\end{cases}
\]


Now let us get the expected value of the loss, EW(y, d), for each value
of y from y = 0 to y = 4. These are the only values of y which need be
considered because if y is greater than 4, the loss will surely be
greater than the loss for y = 4. For y = 0 we have that \(d \ge y\) for
all possible values of d, hence

\[
\begin{aligned}
EW(y, d) &= 0 \cdot P + [\text{Prob. } d = 0](0 - 0)L + [\text{Prob. } d = 1](1 - 0)L \\
&\quad + [\text{Prob. } d = 2](2 - 0)L + [\text{Prob. } d = 3](3 - 0)L \\
&\quad + [\text{Prob. } d = 4](4 - 0)L \\
&= (p_1 + 2p_2 + 3p_3 + 4p_4)L \qquad \text{for } y = 0.
\end{aligned}
\]

In a similar manner, we get

\[
\begin{aligned}
EW(y, d) &= P + (p_2 + 2p_3 + 3p_4)L && \text{for } y = 1 \\
&= 2P + (p_3 + 2p_4)L && \text{for } y = 2 \\
&= 3P + p_4 L && \text{for } y = 3 \\
&= 4P && \text{for } y = 4.
\end{aligned}
\]


Now for any given values of P, L, and \(p_i\), it is a simple matter to
tabulate the values of EW(y, d) for the different values of y in order
to determine which value of y gives the smallest expected loss. For
example, suppose P = $100,000, L = $10,000,000, \(p_1 = .04\),
\(p_2 = .01\), \(p_3 = .001\), \(p_4 = .0002\), and the probability of 5
or more spares being needed is zero (hence \(p_0 = .9488\)); then

for y = 0, EW(y, d) = (.04 + .02 + .003 + .0008)(10,000,000) = $638,000
for y = 1, EW(y, d) = 100,000 + (.01 + .002 + .0006)(10,000,000) = $226,000
for y = 2, EW(y, d) = 200,000 + (.001 + .0004)(10,000,000) = $214,000
for y = 3, EW(y, d) = 300,000 + (.0002)(10,000,000) = $302,000
for y = 4, EW(y, d) = $400,000.

Since under the above assumed conditions the expected loss is smallest
when y = 2, the best ordering policy is to order 2 spares.
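
The tabulation for the insurance-spares example can be reproduced with a short script. This is only an illustrative sketch: the function name `expected_loss` and the variable names are our own, and exact fractions are used so that the dollar figures come out without rounding error.

```python
from fractions import Fraction

def expected_loss(y, price, shortage_loss, probs):
    """Expected loss when y spares are bought.

    probs[d] is the probability that exactly d spares are needed.
    The loss is y*price plus shortage_loss for every spare short.
    """
    shortage = sum(p * max(d - y, 0) for d, p in enumerate(probs))
    return y * price + shortage_loss * shortage

P = 100_000        # price of one spare, in dollars
L = 10_000_000     # loss for each spare needed but not in stock
# Probabilities p_0 .. p_4 from the example in the text.
probs = [Fraction(s) for s in ("0.9488", "0.04", "0.01", "0.001", "0.0002")]

losses = {y: expected_loss(y, P, L, probs) for y in range(5)}
best_y = min(losses, key=losses.get)

for y, loss in losses.items():
    print(f"y = {y}: expected loss = ${int(loss):,}")
print("best order quantity:", best_y)
```

Changing `P`, `L`, or `probs` gives the corresponding row of an ordering table for any other item.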

The newspaper vendor’s problem and the Navy inventory problem 
that we have discussed, simple as they are, contain the most important 
elements of all the other problems we shall discuss. We now give a more 
general formulation of essentially the same problem. 

Suppose that at the beginning of a certain time period we have a
stock, x, of a certain commodity, which we shall call stock before
ordering. During the period a certain demand, d, for the commodity will
be observed. This demand is a chance variable whose probability
distribution is known to us. The probability distribution of demand may
depend upon the stock before ordering and the size of our order, but
once these two quantities are known, the distribution of demand is
known. To prepare for this demand, we have the privilege of ordering
more of the commodity from the producer, ordering no more, or returning
some, but this must be done at the beginning of the period. No orders
can be delivered or stock returned once the period has started. We shall
assume that there is no time lag in delivery from or to the supplier.
Our problem is to find the quantity we should order, which will be
denoted by \(y - x\). Thus y is the quantity on hand at the start of the
time period but after ordering. A return of goods to the producer is a
negative order. In all practical cases there will be certain limits on
the size of orders that can be placed, and our solution of the problem
will take this into account. Our schedule of losses tells us what our
loss is for each possible combination of values of the stock before
ordering, order, and demand. In the newspaper vendor's problem and in
the Navy inventory problem the stock before ordering was zero, which is
why the x did not appear in the schedule of losses.

In general we will choose that size of order such that the expected
loss is minimized. The size of order that minimizes the expected loss
will depend upon the size of stock before ordering. An “ordering policy”
is a schedule showing what size of order to use for any given size of
stock before ordering. The ordering policy is the complete solution to
our problem, for it tells us just what to do in any given circumstances.

A simple example will illustrate the ideas we have been discussing. 

Suppose the proprietor of a newsstand has x copies of a monthly
magazine on hand at the end of the 10th of the month, for which he has
already paid. The wholesale magazine dealer is coming the next morning
and will take back as many magazines as the vendor desires to return,
paying the vendor 4 cents each, or he will sell the vendor any number
of additional copies at 6 cents each. The vendor charges his customers
15 cents per copy. The wholesaler will not return until the end of the
month, at which time he will buy back all unsold magazines at 2 cents
per copy. We assume that the number of people who will attempt to
buy a copy from him during the remainder of the month will be either
4 or 5 with probabilities of 3/4 and 1/4 respectively. Also, if the
vendor has any unsold copies at the end of the month, he will definitely
sell them back to the wholesaler. What should the ordering policy be in
this case? The schedule of losses, which shall be denoted by W(x, y, d)
because it depends on the value of x, on the ordering quantity,
\(y - x\), and on the demand, d, is obtained in the following way:

When \(d \ge y\), y copies will be sold, which will bring the vendor 15y
cents. In addition, if \(y \le x\), then \((x - y)\) copies are returned
to the wholesaler, which brings the vendor \(4(x - y)\) cents, making
\(15y + 4(x - y)\) his total gain; and if \(y \ge x\), the vendor
purchases an additional \((y - x)\) copies, making his total gain
\(15y - 6(y - x)\). The negatives of these gains are the losses,
yielding

\[
W(x, y, d) =
\begin{cases}
-[15y + 4(x - y)] = -11y - 4x & \text{for } d \ge y,\ y \le x, \\
-[15y - 6(y - x)] = -9y - 6x & \text{for } d \ge y,\ y \ge x.
\end{cases}
\]


When \(d \le y\), we have a situation similar to the above except that d
copies are sold by the vendor instead of y copies and \((y - d)\) copies
are returned to the wholesaler at the end of the month at 2 cents each,
yielding

\[
W(x, y, d) =
\begin{cases}
-[15d + 4(x - y) + 2(y - d)] = -13d - 4x + 2y & \text{for } d \le y,\ y \le x, \\
-[15d - 6(y - x) + 2(y - d)] = -13d - 6x + 4y & \text{for } d \le y,\ y \ge x.
\end{cases}
\]


Now we need the expressions for the expected value of the loss,
EW(x, y, d), from which for any given x we will be able to find the y
(hence the order quantity, \(y - x\)) which will minimize the expected
loss. For \(y \le 4\), the demand is certainly equal to or greater than
y, and since in this case W(x, y, d) does not depend on the value of d,
we have

\[
EW(x, y, d) =
\begin{cases}
-11y - 4x & \text{for } y \le 4,\ y \le x, \\
-9y - 6x & \text{for } y \le 4,\ y \ge x.
\end{cases}
\]

Clearly the expected value of the loss is minimized in this case when y
is made as large as possible, which is 4. On the other hand, for
\(y \ge 5\), the demand is equal to or less than y, and we have

\[
\begin{aligned}
EW(x, y, d) &= -13[\tfrac{3}{4}(4) + \tfrac{1}{4}(5)] - 4x + 2y \\
&= -55\tfrac{1}{4} - 4x + 2y && \text{for } y \ge 5,\ y \le x, \\
EW(x, y, d) &= -13[\tfrac{3}{4}(4) + \tfrac{1}{4}(5)] - 6x + 4y \\
&= -55\tfrac{1}{4} - 6x + 4y && \text{for } y \ge 5,\ y \ge x.
\end{aligned}
\]

Here the expected value of the loss is minimized by making y as small
as possible, which is 5. Hence the ordering policy is certain to call for
y = 4 or y = 5, which was fairly obvious anyway from the fact that the
demand could be only 4 or 5. However, the above expressions for
EW(x, y, d) are needed to determine when y = 4 and when y = 5. Suppose
\(x \le 4\); then the expected loss for y = 4 is \(-36 - 6x\) and for
y = 5 it is \(-35\tfrac{1}{4} - 6x\), which is greater than
\(-36 - 6x\); hence the best policy when \(x \le 4\) is to order up to 4.
Now suppose \(x \ge 5\); then the expected loss for y = 4 is
\(-44 - 4x\) and for y = 5 it is \(-45\tfrac{1}{4} - 4x\), which is less
than \(-44 - 4x\); hence the best policy now is to return all those
over 5. To summarize the above, the best policy for the vendor is to buy
up to 4 if he has less than 4 on hand, or to do nothing if he has 4 or
5, or to return any excess over 5 on hand.
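
The policy just derived can be checked by brute force. The sketch below is illustrative only: the function names are ours, and it reads the demand probabilities for 4 and 5 copies as 3/4 and 1/4, which is the reading that agrees with the expected-loss figures quoted in the text.

```python
from fractions import Fraction

# Demand for the rest of the month: 4 or 5 copies.
# Probabilities 3/4 and 1/4 are an assumed reading of the text.
DEMAND = {4: Fraction(3, 4), 5: Fraction(1, 4)}

def loss(x, y, d):
    """Loss in cents (negative of gain) with stock x before ordering,
    stock y after ordering, and demand d."""
    sold = min(y, d)
    gain = 15 * sold                 # sales at 15 cents a copy
    if y <= x:
        gain += 4 * (x - y)          # copies returned on the 11th at 4 cents
    else:
        gain -= 6 * (y - x)          # copies bought on the 11th at 6 cents
    gain += 2 * (y - sold)           # month-end buy-back at 2 cents
    return -gain

def expected_loss(x, y):
    return sum(p * loss(x, y, d) for d, p in DEMAND.items())

def best_stock(x, max_stock=10):
    """Stock level after ordering that minimizes the expected loss."""
    return min(range(max_stock + 1), key=lambda y: expected_loss(x, y))

policy = {x: best_stock(x) for x in range(9)}
print(policy)   # stock before ordering -> stock after ordering
```

The order itself is `policy[x] - x`; a negative value is a return to the wholesaler.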

Next we shall discuss a more general problem—the case where there 
are several time intervals with carry-over of stock from one time in- 
terval to the next time interval. Here we are given a certain number 
of time intervals, and stock may be ordered or returned at the begin- 
ning of any of the intervals, but at no other times. Unused stock at the 
end of an interval may be kept for use in the next interval, with addi- 
tional stock ordered from the supplier if desired, or some or all of it 
may be returned to the supplier. Only the stock available at the begin- 
ning of an interval may be used to supply demand arising in that inter- 
val, and for the present we assume instantaneous delivery of orders 
from the supplier. The total loss is the sum of losses suffered in each of 
the intervals, and the different intervals may have different loss sched- 
ules. Furthermore, the loss in any time interval may depend on the 
whole “past history”—defined as all the stocks, orders, and demands 
in all the preceding intervals—as well as on the stock before ordering, 
order, and demand of the interval itself. Also, the probability distribu- 
tion of the demand that will be observed in an interval may depend on 
the past history as well as on the stock before ordering and order of the 
interval itself. Once the past history and the stock before ordering and 
order of the interval are known, the probability distribution of the de- 
mand that will be observed in the interval is completely known. Pre- 
sumably we are interested in minimizing the expectation of the present 
value of the total loss, and therefore will apply the proper discounting 
factors to the losses incurred in the various intervals in order to get the 
present values of those losses. These discounting factors can be as- 
sumed to be incorporated into the loss functions for the various inter- 
vals. 

A complete ordering policy in the case of many intervals must specify 
just how large we should make the order at the beginning of each in- 
terval in the light of the knowledge we possess at the beginning of the 
interval (i.e. our knowledge of the stocks, orders, and demands in the 
preceding intervals and the stock before ordering of the interval itself). 
In general the order we place at the beginning of any interval will de- 
pend upon the past history, and different past histories will require dif- 
ferent orders. In constructing an ordering policy, it is important to 
remember that once we have reached the beginning of an interval, the 
losses we have suffered in the preceding intervals are now beyond our 
control, so it is only the expected losses from the remaining intervals 
that we worry about. 

For the case of many intervals, we now give a method of constructing
an ordering policy that makes the expected loss as small as possible.
First we specify how much to order at the beginning of the last
interval. This is simple, for there is only one interval left to worry
about, and we want to make the expected loss in that one interval as
small as possible. At the beginning of the last interval we know all
that has happened in the preceding intervals, and therefore we know the
schedule of losses and the probability distribution of demand in the
last interval. Thus we have essentially the problem of making the
expected loss in one interval as small as possible, and we have
discussed this problem of one interval above. In other words, the
problem of how much to order at the beginning of the last interval is a
simple one-interval problem, which we know how to solve. Thus, for all
conceivable past histories we can make up a schedule showing how much to
order at the beginning of the last interval.

Now we specify how much to order at the beginning of the next-to- 
the-last interval. Once we know the past history before this next-to- 
the-last interval, then for any particular order we place at the begin- 
ning of the next-to-the-last interval, we can compute the total expected 
loss in the two last intervals. The reason we can do this is that we have 
already specified what order we are going to place at the beginning of 
the last interval, under any conceivable circumstances. The total ex- 
pected loss in the two last intervals will, of course, depend on the order 
placed at the beginning of the next-to-the-last interval, so we simply 
pick that order that makes the total expected loss in the two last inter- 
vals as small as possible. This order will in general depend on the past 
history before the next-to-last interval. So far, then, we have specified 
how much to order at the beginning of the last interval and at the be- 
ginning of the next-to-last interval. 

Next we specify how much to order at the beginning of the third 
interval from the end. Once we know the past history before that inter- 
val, then for any particular order we place at the beginning of the in- 
terval, we can compute the total expected loss in the last three intervals. 
The reason is that we have already specified how much we will order 
at the beginning of the last two intervals under any conceivable cir- 
cumstances. We place that order at the beginning of the third interval 
from the end that makes the total expected loss in the three last inter- 
vals as small as possible. Thus, we have specified how much to order 
at the beginning of the last three intervals. 

And so we work our way back, interval by interval, until we have 
specified how much to order at the beginning of the first interval. Once 
we reach this point, our problem is solved, for we know how much to 
order at the beginning of each interval to make the total expected loss 
as small as possible. 

An example with two time intervals will now be given to help clarify
the discussion just completed. Let us go back to the last example with
the newsstand proprietor and add another time interval to the problem
by assuming that the proprietor has two possible ordering times, on the
mornings of the 1st and 11th of the month. On the 1st he will have no
stock on hand before ordering, but on the 11th he may have some left
over from the quantity he purchased on the 1st and failed to sell. Let
us also assume that all the conditions given in the previous example
remain unchanged and that the demand during the 1st interval is either
10, 14, or 18 with probabilities of 1/4, 1/2, 1/4 respectively, and that
the demand during the 2nd interval is independent of the demand during
the 1st interval. What we need to determine is how many copies the
proprietor should order on the 1st and what ordering policy he should
use on the 11th. No doubt there are vendors, particularly those who are
reluctant to take risks, who would buy only 14 on the 1st in order to
avoid the risk of suffering losses from returning unsold magazines to
the wholesaler. Let us now determine what the best policy really is.
It will be convenient to introduce the following notation:

\(y_1\) = number of magazines purchased on the 1st.
\(y_2\) = number of magazines in stock after purchasing on the 11th.
\(d_1\) = number of people attempting to buy a magazine during the 1st interval.
\(d_2\) = number of people attempting to buy a magazine during the 2nd interval.
\(x_2\) = number of magazines in stock at the end of the 1st interval. (This is
equal to \(y_1 - d_1\) if \(y_1 > d_1\); otherwise it is 0.)


To find the best ordering policy, we make believe that we have
reached the morning of the 11th and therefore know \(y_1\) and \(d_1\).
We now need to find the value of \(y_2\) which makes the expected loss in
the 2nd interval as small as possible with the stock on hand being
\(x_2\). But the best policy on the 11th has already been worked out in
the last example of the one-interval case, so that policy is the best
one to use on the 11th. For this 2nd interval we found that \(y_2\)
should be 4 if \(x_2 \le 4\), and the expected value of the loss is then
\(-36 - 6x_2\); and \(y_2\) should be 5 if \(x_2 \ge 5\), and the
expected value of the loss is then \(-45\tfrac{1}{4} - 4x_2\). All that
remains to be found is how many the vendor should buy on the 1st so that
the expected value of the losses from both intervals is minimized.
Clearly he should buy at least 14 because he is certain to sell at least
10 during the 1st interval and at least 4 during the 2nd interval. Also,
he should buy at most 18 since the demand during the 1st interval cannot
exceed that quantity.

If \(d_1 \ge y_1\), then the vendor would sell all \(y_1\) copies at a
profit of 9 cents each. His schedule of losses during the 1st interval
would then be

\[
W(y_1, d_1) = -9y_1 \qquad \text{for } d_1 \ge y_1.
\]

He would then end the 1st interval with no stock on hand, \(x_2 = 0\),
and the optimal policy on the 11th would be to buy 4 copies, giving an
expected loss of \(-36 - 6x_2 = -36\) for the 2nd interval. Thus the
total loss from both intervals is

\[
-9y_1 - 36 \qquad \text{for } d_1 \ge y_1.
\]

If \(d_1 \le y_1\), the vendor would sell only \(d_1\) copies at 15 cents
each and he would have bought \(y_1\) copies at 6 cents each, making the
loss during the 1st interval

\[
W(y_1, d_1) = -[15d_1 - 6y_1] = -15d_1 + 6y_1 \qquad \text{for } d_1 \le y_1.
\]

He would then end the 1st interval with a stock of
\(x_2 = y_1 - d_1\), and we know that the best policy then would be to
make \(y_2 = 4\) if \(x_2 \le 4\) and to make \(y_2 = 5\) if
\(x_2 \ge 5\). The expected loss from the 2nd interval in these two
cases is given by

\[
-36 - 6x_2 = -36 - 6(y_1 - d_1) \qquad \text{for } y_1 - d_1 \le 4
\]

and

\[
-45\tfrac{1}{4} - 4x_2 = -45\tfrac{1}{4} - 4(y_1 - d_1) \qquad \text{for } y_1 - d_1 \ge 5.
\]

The total loss for both intervals for \(d_1 \le y_1\) would then be
given by

\[
-15d_1 + 6y_1 - 36 - 6(y_1 - d_1) = -36 - 9d_1
\qquad \text{for } d_1 \le y_1 \le d_1 + 4
\]

and

\[
-15d_1 + 6y_1 - 45\tfrac{1}{4} - 4(y_1 - d_1) = -45\tfrac{1}{4} - 11d_1 + 2y_1
\qquad \text{for } y_1 \ge d_1 + 5.
\]


We now want to compute the expected value of the total loss for \(y_1\)
ranging from 14 to 18, which are the only values we need consider. We
note that for \(d_1 = 10\), we have \(y_1 \ge d_1 + 5\) for all the
values of \(y_1\) except when \(y_1 = 14\), in which case we have
\(d_1 \le y_1 \le d_1 + 4\). For \(d_1 = 14\), we always have
\(d_1 \le y_1 \le d_1 + 4\), and for \(d_1 = 18\), we have
\(y_1 \le d_1\). Hence the expressions for the expected value of the
total loss are

\[
\tfrac{1}{4}(-36 - 9(10)) + \tfrac{1}{2}(-36 - 9(14)) + \tfrac{1}{4}(-126 - 36) = -153
\qquad \text{for } y_1 = 14
\]

and

\[
\begin{aligned}
&\tfrac{1}{4}(-45\tfrac{1}{4} - 11(10) + 2y_1) + \tfrac{1}{2}(-36 - 9(14)) + \tfrac{1}{4}(-9y_1 - 36) \\
&\qquad = -128\tfrac{13}{16} - 1\tfrac{3}{4}\,y_1 \qquad \text{for } 15 \le y_1 \le 18.
\end{aligned}
\]

By taking \(y_1 = 18\) we find that the expected total loss is
\(-128\tfrac{13}{16} - 1\tfrac{3}{4}(18) = -160\tfrac{5}{16}\), which is
the smallest we can make this expected loss. Therefore the best policy
for the vendor is to buy 18 copies on the 1st and to use the policy
previously given on the 11th.
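
The backward-working method can be sketched in code for this two-interval example. The script is an illustration under stated assumptions, not part of the original paper: the function names are ours, the first-interval probabilities are read as 1/4, 1/2, 1/4 and the second-interval probabilities as 3/4, 1/4, these being the readings that reproduce the expected losses quoted in the text.

```python
from fractions import Fraction

# Assumed readings of the demand distributions (see lead-in).
D1 = {10: Fraction(1, 4), 14: Fraction(1, 2), 18: Fraction(1, 4)}
D2 = {4: Fraction(3, 4), 5: Fraction(1, 4)}

def second_loss(x2, y2, d2):
    """Loss in cents in the 2nd interval: stock x2 before ordering,
    y2 after ordering, demand d2."""
    sold = min(y2, d2)
    trade = 4 * (x2 - y2) if y2 <= x2 else -6 * (y2 - x2)
    return -(15 * sold + trade + 2 * (y2 - sold))

def second_stage_value(x2, max_stock=30):
    """Step 1 of the backward method: smallest expected loss in the
    last interval for every possible stock x2."""
    return min(
        sum(p * second_loss(x2, y2, d2) for d2, p in D2.items())
        for y2 in range(max_stock + 1)
    )

def total_expected_loss(y1):
    """Step 2: expected loss over both intervals when y1 copies are
    bought on the 1st and the best policy is followed on the 11th."""
    total = Fraction(0)
    for d1, p in D1.items():
        sold = min(y1, d1)
        first = 6 * y1 - 15 * sold          # cost minus sales revenue
        total += p * (first + second_stage_value(y1 - sold))
    return total

totals = {y1: total_expected_loss(y1) for y1 in range(10, 23)}
best_y1 = min(totals, key=totals.get)
print(best_y1, float(totals[best_y1]))
```

With more than two intervals the same pattern repeats: the value function of each later interval is computed first and then used inside the expectation for the interval before it.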

A further generalization of the inventory problem is to allow time
lags in the delivery of orders. In other words, an order placed at the
beginning of an interval will not arrive until a certain number, T, of
intervals have passed. Otherwise the problem is the same as the type
just discussed, and the method of solution is almost the same. Obviously
the last order will be placed (T + 1) time intervals before the end,
since no order placed later than that will arrive in time to be of any
use. We choose the size of the last order to make the total expected
loss in the remaining (T + 1) intervals as small as possible. The order
we choose will depend upon the past history known at the moment the
order is placed, and this past history will include quantities already
ordered but which will arrive in the future. Once we have found the
proper size for the last order, for any particular next-to-last order we
can compute the expected loss in the remaining (T + 2) intervals. We
choose the size of the next-to-last order which minimizes the expected
loss in the remaining (T + 2) intervals. This size will, of course,
depend upon the past history known at the moment the order is placed.
And so, interval by interval, we work our way back to the first order.

Another generalization is to allow simultaneous demands for several 
different types of items, necessitating the stocking of more than one 
commodity. The demands may be interrelated in any way, and some 
commodities may be partial or complete substitutes for others. It is 
assumed that, given a particular set of demands and a particular set of 
commodities on hand, we know how to use the commodities most ef- 
fectively in trying to satisfy the demands. The schedule of losses in 
this case tells us what our loss is for any given set of demands and any 
given combination of commodities available, assuming that the com- 
modities available are allocated most effectively. In this case the prob- 
ability distribution of demand is a joint probability distribution of the 
different types of demand, which gives us the probability of observing 
any particular combination of demands, and an ordering policy must 
tell how much of each commodity to order at each stage. In computing 
an ordering policy for this case, the principles are the same as in the 
single commodity case, but the details are more troublesome, and for 
a large number of items it may be practically impossible to carry out 
the computations. 

As a last generalization we have the case where the probability dis- 
tribution of demand is not completely known—we may know only that 
the distribution is of a certain type. Then, for any given ordering pol- 
icy, there will not be merely one expected loss, but a whole set of ex- 
pected losses, one for each possible distribution of demand. How then 
shall we compare two different ordering policies, since one may be bet- 
ter for some distributions of demand and worse for others? One method 
of doing this is to find, for each ordering policy, the maximum expected 
loss over all possible distributions, and then choose that ordering policy 
with the smallest maximum expected loss. This ordering policy is called 
a “minimax” policy because it minimizes the maximum expected loss. 

In the following example illustrating the minimax policy, some mathe- 
matical terminology and methods will be used which may be unfamiliar 
to some readers and can be omitted by them without too much loss. Let 
us go back to the very first example in this paper in which a newspaper 
vendor buys papers in the morning at 3 cents per copy, sells them to 
customers at 5 cents per copy, and resells unsold copies to the supplier 
at 1 cent per copy; but now we shall assume that the distribution of the 
demand, d, is normal, with standard deviation 50 and mean between 
1000 and 2000, the exact value of the mean being unknown. We want 
to find how many papers the vendor should buy in the morning accord- 
ing to the minimax policy. 
As before, the vendor's schedule of losses is given by

\[
W(y, d) =
\begin{cases}
2y - 4d & \text{for } d \le y, \\
-2y & \text{for } d \ge y.
\end{cases}
\]

From this it is seen that for any order quantity, y, the vendor's loss
will be greatest when d is smallest. But as the mean of the normal
distribution of d decreases, small values of d become more probable and
large values of d become less probable. Therefore, for any given y, the
expected loss is greatest when the mean of the normal distribution of d
is as small as possible, namely, 1000. Hence, if we choose the y that
minimizes the expected loss when the mean of the distribution of d is
1000, this will be the minimax y (the y with the smallest maximum
expected loss). For suppose we use some other y; then the maximum
expected loss using this other y will occur when the mean of the
distribution of d is 1000, and this maximum will be greater than if we
had used the y that minimizes the expected loss when the mean of the
distribution of d is 1000. To find the minimax y, let F(x) denote the
normal cumulative probability distribution function with mean 1000 and
standard deviation 50, and f(x) the corresponding density function.
Then the expected loss (assuming the mean of d is 1000) is equal to

\[
\begin{aligned}
&\int_{-\infty}^{y} (2y - 4x) f(x)\,dx - 2y[1 - F(y)] \\
&\qquad = -2y - 4\int_{-\infty}^{y} x f(x)\,dx + 4yF(y).
\end{aligned}
\]

Differentiating with respect to y, we get \([-2 + 4F(y)]\). This
derivative is zero for \(F(y) = 1/2\), negative for \(F(y) < 1/2\), and
positive for \(F(y) > 1/2\). Therefore we should take y so that F(y) is
equal to 1/2, which means y should be equal to 1000. This is the
minimax y.
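
The minimax argument can also be checked numerically. The following sketch is illustrative only: the function names, the grid of candidate means (1000 to 2000 in steps of 50), and the range of order quantities searched are our own choices. It uses the standard closed form for the partial mean of a normal distribution, namely mean times the cumulative probability minus the standard deviation times the standard normal density.

```python
import math

SIGMA = 50.0   # standard deviation of demand, as in the example

def expected_loss(y, mean, sigma=SIGMA):
    """Expected loss -2y - 4*int_{-inf}^{y} x f(x) dx + 4y F(y)
    for normally distributed demand."""
    z = (y - mean) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # F(y)
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal density at z
    partial_mean = mean * cdf - sigma * pdf                   # int_{-inf}^{y} x f(x) dx
    return -2.0 * y - 4.0 * partial_mean + 4.0 * y * cdf

means = range(1000, 2001, 50)    # assumed grid of possible means
orders = range(800, 2201)        # assumed range of order quantities

def worst_case(y):
    """Maximum expected loss over all candidate means."""
    return max(expected_loss(y, m) for m in means)

minimax_y = min(orders, key=worst_case)
print(minimax_y, round(worst_case(minimax_y), 2))
```

For every order quantity tried, the largest expected loss occurs at the smallest mean, which is why the search reduces to the one-distribution problem solved in the text.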
CENSUS TRACTS AND URBAN RESEARCH* 


Donald L. Foley 

SELECTED statistics have been reported on a census tract basis by the 
Census Bureau for the past four decennial censuses. The number of 
tracted cities has increased during this period from 10 to 72. In short, 
the census tract statistical reporting system has become a well developed 
source of information. 

At recent Census Tract Conferences most of the discussion has cen- 
tered on applied uses of tract data. Thus representatives from business, 
market research [1], city planning and various social and health agen- 
cies have reported on putting census tracts to work. This paper supple- 
ments these earlier reports (1) by examining how census tract statistics 
have facilitated urban research of a more theoretical sort, (2) by dis- 
cussing some methodological problems that have been encountered, 
and (3) by suggesting ways in which census tracts can most effectively 
implement such research in the future. The focus here will tend toward 
pure rather than applied, and toward university rather than business or 
civic agency research. 


THE USE OF TRACT STATISTICS IN “PURE” URBAN RESEARCH 


In general, the tract reports issued by the Census Bureau have cen- 
tered around certain population and housing characteristics, areally 
assigned according to home address. Each category of information has 
usually been reported in frequency distribution form, from which se- 
lected summary statistical measures (e.g., percentages or averages) can 
be computed. 

Research use of tracts is by no means limited to data reported by the 
Census Bureau. This is one of the intriguing assets of the tract report- 
ing system. Numerous and important additional types of information 
have been assembled by local agencies and researchers, although prob- 
ably more for applied than for pure research purposes [2, 3]. Thus, we 
have had tract statistics for juvenile delinquency [4, 5], receipt of wel- 
fare care [5, 6], births and deaths [5, 7], illness [5], mental illness [8], 
suicide [9], residential mobility [6, 10], etc. 

So much for an introductory look. Let us now turn to university
research. In which academic fields have tract data been used in the
conduct of pure research? In general, the research most directly
promoted has been that dealing with the differential characteristics of
urban residential subareas within large cities, being conveniently
subsumed under the label, human ecology [11]. Urban sociologists, urban
geographers, and land and real estate economists have been the most
active devotees, while various other social scientists and scattered
professionals with pure research interests, in such fields as municipal
administration and business, have been peripherally linked.

Chicago, December 28, 1952. At the time of presentation the author was affiliated with the University 
of Rochester. 

Discouraging as it may seem to proponents of the census tract sys- 
tem, a sober appraisal leaves the impression that there has been but 
limited pure research use of tract statistics and this mainly in the field 
of urban sociology. Urban geographers have relied on their own map- 
ping and descriptive skills and have generally shunned the compara- 
tive, quantitative methodology that would most logically provide a 
receptive context for using census tract data. Some social scientists, 
notably certain real estate economists, have placed greater reliance on 
census tabulations by city blocks than by census tracts [12]. In political 
science and in other branches of economics there seems to have been 
virtually no research use of the census tract system. 

What research patterns have been employed in adapting tract ma- 
terial to pure research use? An initial distinction here is between those 
studies where census tract statistics have provided the central data and 
those researches where tract figures have been used (although less spec- 
tacularly) in the selection of study districts [13] or in furnishing statis- 
tics of relatively minor importance. 

It would seem fruitful to identify six main ways in which tract sta- 
tistics have been used, viewed in methodological terms. These different 
patterns are not mutually exclusive; two or more may be interwoven 
within the same study. 

1. Descriptive use in which the differential incidence, by tract, of a 
single factor is reported. In this pattern the incidence variations can 
usually be conveniently summarized in map form [2], using what we 
may term an ecological map. Some comprehensive reports have in- 
cluded a series of such maps, reporting both census collected and lo- 
cally assembled statistics. Among the most ambitious of such reports 
are those for Minneapolis [14], Seattle [15], Cleveland [7], and Rochester 
[5, 16]. In some studies of very large cities, tracts have been combined 
to form concentric zones or sectors, with incidence rates reported ac- 
cordingly [4, 8, 17]. 

2. Descriptive use in which the cross-cutting of two or more separate 
incidence patterns is reported. In map form, this use involves either the 
comparison of two or more of the single factor maps, as prepared in use 
(1), or the preparation of a single map in which the cross classification 
of factors is shown with the aid of an appropriate legend [2, 18]. This 
step characteristically precedes the somewhat more sophisticated uses 
(4), (5), or (6). 

3. Time-series use in which changes, by tracts, are reported for stated 
periods of years. It is common here to summarize the findings in map 
form with legends indicating percentage increases or with time-series 
graphs fitted into the various tracts. Where statistics have been re- 
ported by concentric zones for some of the largest cities, various other 
forms of graphic analysis have been used, as in the Chicago studies of 
population succession by Cressey and Ford [19, 20]. 

4. Analysis of relationships, utilizing what has been termed ecological
correlation [21]. Here the variables are summary measures, by census 
tracts. Thus, one can correlate per cent foreign born and median school 
years completed. In this case nativity status and education are not 
correlated directly, person by person, as in individual correlation. Usu- 
ally, in fact, in ecological correlation we do not know this information 
on a person to person basis. Studies of this type have been conducted 
in Chicago [4, 8, 22], St. Louis [6, 23], and other cities [5, 24]. 

5. The interpretation of individuals' characteristics in terms of the
general social environment of the tract. In this case the former emerge 
from the specific study while the latter is available in the form of pre- 
viously published tract statistics. Faris and Dunham [8], for example, 
utilized this design to demonstrate that mental illness rates were higher 
for Negroes and certain other groupings in areas (combinations of tracts) 
not primarily populated by their own members. 

6. Use in statistical index form, each index presumed to represent a
cluster of factors. Thus, average rental [7] and median education [25] 
have been promoted as indices of socio-economic status. A challenging 
recent attempt to develop statistical indices is the work by Shevky and 
Williams using Los Angeles census tract data [18] in developing three 
indices: for social rank (roughly socio-economic status), for urbaniza- 
tion (a complex of factors relating to type of family life), and for segre- 
gation (the residential concentration of minority groups). Based on the 
alternate ways in which these three indices can be related to each 
other, the authors have suggested a typology of residential areas. Alter- 
nate segregation indices have also been suggested by other researchers 
[26, 27, 28]. Kendall and Lazarsfeld have presented a stimulating dis- 
cussion of the various types of indices usable at a tract level according 
to the alternative logical ways by which they relate to direct charac- 
terizations of the individuals included [29, pp. 187-196]. 


736 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1958 
SOME METHODOLOGICAL PROBLEMS 


The research use of census tract data has involved a series of ecologi- 
cal and statistical assumptions, some of which have been reexamined 
during recent years. It should be recognized that the most vigorous ex- 
pansion of the census tract system tended to coincide with the phe- 
nomenal rise of the ecological “school” of social research. By now, the 
sociologist’s intellectual honeymoon with urban ecology is over and he 
is faced with the problem of settling down and living with this ecologi- 
cal approach. Such scholars of ecology as Hollingshead and Hawley [30] 
have in recent years identified theoretical difficulties inherent in “clas- 
sical” ecology and have indicated considerable skepticism regarding 
the future utility of spatial analysis, narrowly conceived. 

Let us examine some of the more specific assumptions that have been 
implicit in ecological research using census tract data: 

1. It was assumed during urban ecology’s early years that the large
city was divided into “natural areas.” There was some belief that cen- 
sus tracts could be so established that they would coincide closely with 
these natural areas. Thus, internal homogeneity was sought and as-
sumed for each tract [2, 31, 32]. The utility of the natural area concept
has since been questioned by Hatt [33, 34] and the usefulness of data 
from non-homogeneous tracts has been challenged by a number of re- 
searchers [15, Appendix B; 35]. Myers, for example, in a recent study 
concluded that of New Haven’s 28 census tracts “10 tracts are homo-
geneous to a remarkable extent; seven are less homogeneous; while the
remaining 11 are heterogeneous” [36].

2. With Burgess’ important concentric zone construct, it appeared
likely that general principles of urban spatial patterning would emerge, 
embodied in an ecological theory of urban structure. It has become in- 
creasingly evident, however, that at best alternative constructs must 
be admitted, such as the Hoyt sector theory and Davie’s insistence on 
the industrial pattern’s primacy. A more pessimistic view concludes 
that for many cities historical or topographic factors have had so per- 
vasive an influence as seriously to limit the predictive value of the 
broader principles. So while some cities (Chicago, St. Louis, Rochester
[37]) tend to uphold much of Burgess’ and/or Hoyt’s theories, other
cities (Boston [38], Pittsburgh, New York, Flint [39]) have more com- 
plex patterns. We may eventually need to introduce a typological sys- 
tem such as Shevky’s that will be less geared to grand principles and 
more to identifying certain types of urban areas in whatever overall 
pattern they take. Where Burgess’ scheme has not proved applicable,
concentric mile zones, gradients, and similar ecological tech-
niques tend to lose much of their utility.

3. In many research projects it has been assumed that ecological in- 
dices were valid measures of certain social phenomena. In the study of 
juvenile delinquency, for example, the number of boys brought before 
a juvenile court or other agency (and expressed as a rate) has been used 
as an index of delinquency [4]. In the 1940 Census enumeration, the 
number or per cent of dwelling units “needing major repair” was avail-
able as an index of housing condition.¹ In 1950 a “dilapidated” cate-
gory was introduced. But we have had relatively little systematic vali-
dation of these indices. The work by Schmid in this connection is im-
portant. Using 1940 tract statistics from 20 medium-sized cities, he
examined the degree to which a single index, such as educational level
or rental level, is a valid measure of a larger complex of factors [25].

4. There has been some tendency for researchers rather uncritically 
to accept census tract statistics as reliable. There are conditions, how- 
ever, under which one should recognize that a sampling error may be 
present, particularly where the population base for the tract is small. 
This problem was recognized rather early in the development of the 
census tract system by various statisticians [40, 41, 42, 43], but it is not
certain that all other users of tract statistics have heeded the cautions. 
Now in the 1950 census tract reports the problem has been reopened by 
the Census’ reliance on a 20 per cent sample for some nine published 
tract tabulations. This has resulted in such potentially important sta-
tistical indices as years of schooling and family income now being sub-
ject to sampling error.²

5. In the impressive series of studies that have used ecological cor-
relation it has been assumed that correlations demonstrated meaning-
ful interrelations of factors. Certain scholars in the 1930s [44, 45] and a
recent vigorous article by Robinson [21] have pointed to serious statis-
tical difficulties implicit in ecological correlation. Robinson concludes
that “. . . the only reasonable assumption is that an ecological correla-
tion is almost certainly not equal to its corresponding individual cor-
relation” [21, p. 357]. These critics have thus shown not only that eco-
logical correlations run higher than individual correlations, but that
the fewer the ecological areas, the higher the correlations. Hence a cor-
relation for City X based on 20 large districts will turn out to be larger
than one based on 85 smaller districts, say, census tracts. Menzel, on
the other hand, has pleaded the case for ecological correlation where it
is clearly understood that the characteristics being correlated are
meaningfully interpretable in areal as well as (or instead of) in individ-
ual terms [46].

¹ As a matter of fact, this index did not prove to be consistently valid when used in research in St. [...]
² [...] Schmid’s methodological research tended to validate the use of median school years completed as an
important index [25], we now find that in the 1950 census reports the utility of this statistical indicator
is somewhat reduced.

6. In most earlier ecological research a rather static approach was 
assumed. Hence, to study urban structure it was necessary to have 
statistics that related only to the individual in his tract of residence at 
the time of the enumeration. No information, for example, was pro- 
vided on home-work or home-shopping spatial relations. With rare 
exceptions (Cleveland statistics for several years during the 1930s 
[10]), we have lacked information on intertract residential mobility 
within a metropolitan region. We have had no usable information as to 
residents’ association memberships and psychological identifications. 


PROMISING FUTURE USES OF CENSUS TRACT DATA 


It now seems appropriate to summarize what appear to be some of 
the most fruitful continuing uses for census tract statistics in pure ur- 
ban research. For in spite of the skeptical tone in which a number of 
the above points have been phrased, it is apparent that the tract re- 
porting system, when judiciously utilized, fills a striking need and cer- 
tainly deserves to be maintained. It is far more economical for Census 
reporting and more convenient for a variety of research applications 
than is reporting on a block basis. For the largest cities it provides a 
workable unit by which statistics can also be assembled for even larger 
areas or districts. 

An initial recommendation is that researchers in such academic fields 
as geography, political science, and social psychology be “educated” to 
the potential research adaptations of the census tract system. For ex- 
ample, the author recently overheard a political scientist admitting 
ignorance of census tract data, when questioned by a fellow sociologist. 
Nor had this political scientist heard of the recent study by Salmon and 
Olds of St. Louis voting behavior [23]. This scholar in the field of po-
litical behavior showed considerable interest in the fact that tract sta-
tistics could often be combined into ward statistics, making possible
ecological correlations between voting behavior and various social
characteristics.

A second suggestion is that census tract data may have their great-
est general research value in providing rough ecological profiles. It
would be misleading to promise too much in the name of census tract
statistics. They do not, for example, offer the areal refinement available 
from block data. Then, too, tract statistics are typically encumbered 
with certain limitations inherent in their functioning as statistical in- 
dices. There would seem to be a continuing need for guidance to poten- 
tial users. 

A third proposal flows from the second: that the use of tract statistics 
be integrated with other research approaches. An analysis of tract in- 
formation or the plotting of tract statistics on work maps can some- 
times be helpful at exploratory levels of research. Statistical profiles of 
particular sections of a city provide an excellent backdrop for non- 
quantitative case-study types of analyses. In Stephan’s words (dating 
from the mid 1930s), “Census tract research will probably be most ef- 
fective when considered not as a method of study complete in itself 
but as one step in a sequence of investigations” [39, p. 166 Suppl.]. 

Fourth, ecological correlations should be used only if it is clearly un- 
derstood that they tend to relate characteristics of areal units and that 
they are not adequate substitutes for individual correlation. If, for ex- 
ample, a researcher wants to study the correlates of the incidence of 
mental illness, he should recognize the methodological alternative of 
directly exploring the background characteristics of persons who are ill.
The researcher should also take into account the effect of tract or areal
unit size on the magnitude of the resulting ecological correlation. 
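
The effect of areal unit size noted in this recommendation can be illustrated with a small simulation. The sketch below is not taken from any of the studies cited: the data are synthetic, the 400 tracts and 20 districts are hypothetical (chosen so that the pattern shows clearly), and every parameter value is an assumption made only for illustration. Two characteristics share a city-wide gradient but are nearly unrelated person by person; the correlation is then computed for individuals, for tract means, and for means of districts formed by merging adjacent tracts.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def area_means(values, labels):
    """Mean of `values` within each area label, returned in label order."""
    totals, counts = {}, {}
    for v, g in zip(values, labels):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [totals[g] / counts[g] for g in sorted(totals)]

# Hypothetical city: all sizes and standard deviations are assumptions.
N_TRACTS, PER_TRACT, N_DISTRICTS = 400, 50, 20
rng = random.Random(0)

xs, ys, tract_of, district_of = [], [], [], []
for t in range(N_TRACTS):
    level = -1.0 + 2.0 * t / (N_TRACTS - 1)   # gradient shared by both traits
    tract_x = level + rng.gauss(0, 0.8)       # tract-specific departures
    tract_y = level + rng.gauss(0, 0.8)
    for _ in range(PER_TRACT):
        xs.append(tract_x + rng.gauss(0, 3.0))  # large person-to-person noise
        ys.append(tract_y + rng.gauss(0, 3.0))
        tract_of.append(t)
        district_of.append(t * N_DISTRICTS // N_TRACTS)  # merge adjacent tracts

r_individual = pearson(xs, ys)
r_tract = pearson(area_means(xs, tract_of), area_means(ys, tract_of))
r_district = pearson(area_means(xs, district_of), area_means(ys, district_of))

print(f"individual: {r_individual:.2f}  tract: {r_tract:.2f}  district: {r_district:.2f}")
```

Under these assumptions the individual correlation is close to zero, the tract-level correlation is moderate, and the district-level correlation is high. Averaging removes person-to-person and tract-to-tract variation while leaving the shared gradient intact, which is why the ecological figures cannot stand in for the individual one.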

Fifth, there is a continuing need for ingenuity in introducing new 
types and forms of tract information. At the University of Miami, 
Wolff has been developing a technique for forecasting population by 
census tracts [47]. With many of the largest cities now having a back-
ground of three or four decades of census statistics, such analysis of in-
ternal population trends may become increasingly feasible.

Under the sponsorship of the Social Science Research Council, the 
Pacific Coast Committee on Community Studies (Leonard Broom, 
Chairman) is currently preparing a research memorandum that will 
include several methodological contributions [48]. Schmid has been re-
fining an approach whereby the Guttman scaling technique may be ap-
plied to census tract data in an attempt to type residential areas.
Robinson, Broom, Shevky, and Bell have all been engaged in further
developing and testing areal typologies. These researches will be in-
cluded in the Committee’s memorandum. One other recent West Coast
attempt at developing urban subcultural areas is Wann's research at
the University of California [49].

There would seem to be a continuing need for more measures, on a
tract basis, of residential mobility and various commuting and activity
patterns. The brilliant theoretical study of residential mobility by 
Stouffer [50] was only possible because of Green's unique assembly of 
intertract residence shifts [10]. If it were possible to replicate these 
statistics in other cities or to devise similar cross tabulations, by tracts, 
on home-to-work or on home-to-other-activity movements, our under- 
standing of daily population movements and of dependency on com- 
munity facilities could be enhanced. The coding of certain information 
by tract of employment or of shopping might be a helpful variant. 

And, finally, it seems appropriate to stress the need within each large
city for effective communication among researchers so as to maximize
the chances that data and methods from one study will have by-prod-
uct value for succeeding studies. Base maps, street indexes, and certain
arrangements for filing and interchanging data should be provided.
The highly ingenious punched card system developed for St. Louis by
Olds [51], although built around the city block as the basic unit, may
have certain applicability on a tract or a block-and-tract basis for other
cities.


ON A PROBABILITY MECHANISM TO ATTAIN AN
ECONOMIC BALANCE BETWEEN THE RE-
SULTANT ERROR OF RESPONSE AND
THE BIAS OF NONRESPONSE


W. Edwards Deming
New York University 


The author postulates a probability mechanism for the 
simultaneous production of the bias of nonresponse and for the 
variance of response. The nonresponse arises from a graded
series of classes of the members of the universe to be 
sampled. The classes range from an impregnable core of no 
possible response, on up to a class of complete response. 
Nonresponse arises from two sources, not at home, and re- 
fusal. Refusals are of two kinds, permanent and temporary. 
The variation in the amount of time spent at home, and the 
variation in the firmness of the temporary refusal, produce the 
graded series of classes. The bias of nonresponse arises from 
the variation of any characteristic from one class to another. 
The variance of response arises from the variation of any 
characteristics from one member to another within a single 
class, and from the random variation in the number of re-
sponses therefrom.

An increase in the size of the initial sample or a more 
efficient method of selection will decrease the variance of 
response, but will have no effect on the bias of nonresponse. 
Successive recalls, on the other hand, decrease the bias of
nonresponse, and are more effective than an increase in the size
of the sample or a more efficient method of selection in de- 
creasing the root-mean-square error which arises from both 
nonresponse and from the variation of response. 

The results show that without recalls, it is hazardous to 
put any confidence in the result, no matter how big the sample, 
even when the variation in the measured characteristic is only 
two-fold from the class of lowest response to the class of 
highest response. 

With the levels of response assumed here (taken from aver- 
age urban experience), and with an estimate formed by 
summing up the initial call and the recalls, the first two recalls 
effect together about a 50% reduction in the initial bias of 
nonresponse. Further recalls continue to be productive. In 
fact, with this method of estimation, each recall added to a 
sampling plan, even to six recalls, actually increases the 
amount of information obtained for each dollar expended on 
interviewing. 

Even with three recalls, and with only a two-fold variation
from the class of lowest response to the class of highest re-
sponse, an initial sample bigger than the equivalent of from
300 to 500 binomial cases in any one subclass is ineffective and
uneconomical. The apparent precision of a bigger sample is a 
delusion, as with bigger samples the bias of nonresponse will 
eclipse the error of sampling unless there are 4, 5, or more 
recalls. An attempted “complete count” is no exception and 
often represents an extreme waste of effort. 

For high accuracy, a plan that uses the ordinary method of 
estimation by combining the initial attempt and the recalls 
must support 4, 5, or 6 recalls, along with an initial sample 
equivalent to from 800 to 1500 binomial cases. 

For any proposed survey, calculations based on rough ad- 
vance estimates of the constants that appear in the formulas 
will predict to a useful degree of approximation the biases and 
the variances to be expected from various types of plans. Fig- 
ures on costs will then point out which plan is most economical, 
of those that are possible, for the attainment of a prescribed 
accuracy. 

Where extremely high accuracy is required, the Politz plan 
with 2000 or more binomial cases becomes competitive in cost 
with a survey that depends on recalls. In any case, the 
Politz plan has the advantage of speed and of being able to 
produce results under circumstances wherein recalls are im- 
possible (for example, listening to a radio program). 

The proposed mechanism provides a theory of bias to sup- 
plement the theory of sampling. It indicates the possibility of 
new and more efficient methods of estimation than the simple 
combination of the initial attempt and the recalls, as it will 
provide a rational basis for extracting more information from 
the recalls. It will also point out, for any particular method
of estimation, what empirical information will be helpful in the 
planning of the efficient allocation of effort amongst the initial 
sample and the recalls. 


THE AIM OF THIS RESEARCH


Nonresponse in a survey is devastating and discouraging, whether
the survey be by mail or by interview. In careful survey-practice,
efforts have been made in many directions to reduce it. One usual
solution is to find ways to build up the initial response. An additional
solution is to call on the nonresponses, and to call and call. The first
recorded systematic plan for putting pressure on a sample of nonre-
spondents appears to have been carried out by Maurice Leven¹ in 1934.
Substitution does not help: it is only equivalent to building up the
size of the initial sample, leaving the bias of nonresponse undiminished.


¹ Maurice Leven, The Incomes of Physicians (Chicago, 1932), pp. 12 and 13. Mr. Stanley Lebergott [...]
to this work. With regard to [...], see, for example, Cochran, Sampling Techniques (Wiley, 1953), p. 302.

Hansen and Hurwitz² found the optimum fraction of recalls to reach
the minimum sampling variance for a fixed total cost, the estimate being
formed, as usual, by pooling the initial responses and the recalls. Birn-
baum and Sirken³ apparently sought to minimize the mean square
error that arises from both the variance of the responses and from
failure to obtain interviews for any reason, including the permanent
refusals (this I judge from their “nonresponding groups responding yes”;
not clear to this author). Houseman⁴ presented new results on the
total bias that may arise from different classes of nonresponse. A new
approach, in surveys conducted by interviews, is the Politz plan,⁵
in which only the temporary refusals require recalls, as the correction
for people not found at home while the interviewer is in the area is
made by classifying the respondents according to the chance of finding
them at home, and then by weighting the responses accordingly.

It turns out that the bias of nonresponse is probably so serious in 
many if not most surveys that the specification of the number of recalls, 
and the adjustment of the original size of the sample to permit either 
the use of the Politz plan or the requisite number of recalls to balance 
the bias of nonresponse against the variance, and to stay within the 
allowable budget, are an essential part of sample-design where the aim 
is to produce as much information as possible per unit cost. 

The purpose of this paper is (a) to study the evidence produced by
a proposed mechanism that will give rise to a calculable variance, to
a calculable bias of nonresponse, and to a calculable cost; (b) on the
basis of this mechanism to make a determination of the number of re-
calls that are required to reach a desired accuracy at minimum cost.
The allocation of the effort between the initial sample and the recalls 
is as important as the usual theory for calculating a sample-size. 


² Morris H. Hansen and William N. Hurwitz, “The problem of nonresponse in sample surveys,”
Journal of the American Statistical Association, vol. 41 (1946), pp. 517-29.

³ Z. W. Birnbaum and Monroe G. Sirken, “Bias due to nonavailability in sampling surveys,” Journal
of the American Statistical Association, vol. 45 (1950), pp. 98-111; “On the total error due to non-inter-
view and to random sampling,” International Journal of Opinion and Attitude Research, vol. 4 [...].
Cochran in his Sampling Techniques (Wiley, 1953) gives on page 290 an excellent summary
of Birnbaum and Sirken’s results.

⁴ Earl E. Houseman, “Statistical treatment of the nonresponse problem,” Agricultural Economics
Research, vol. v (1953), pp. 12-19.

⁵ The Politz plan was under discussion as early as 1945 in conversations [...]
this author. Experimental work thereon commenced in 1946 in the Alfred Politz research [...],
in which the weighting became routine through various simplifying procedures. [...]
application were presented in a joint article by Alfred Politz and Willard R. Simmons, “An attempt to
get the not-at-homes into the sample without call-backs,” Journal of the American Statistical Associa-
tion, vol. 44 (1949), pp. 9-31.

H. O. Hartley described what is essentially the Politz idea in a discussion of [...]
read by Yates at a meeting in London (see Frank Yates, Journal of the Royal Statistical Society, vol.
cix (1946), p. 37 in particular), but Hartley made no mention of experimental work either accomplished
or intended.

A further purpose of the paper is to compare the results and the costs
of recalls with the alternative Politz plan.


CRITERION FOR THE OPTIMUM PLAN 


We now define the root-mean-square error. The criterion to be
adopted here for the optimum plan is that it shall deliver a prescribed
mean square error at minimum cost. The root-mean-square error (to be
abbreviated r-m-s error hereafter) of any plan of survey will by defini-
tion denote the hypotenuse of a right triangle, one leg of which is the
bias of the nonresponse that arises from the plan, and the other leg of
which is the standard error of the plan (see Fig. 1). Different plans
will have different triangles. By definition, the criterion for the opti-
mum plan is that it shall give a shorter hypotenuse than any other plan
will give for the same cost; or, alternately, a plan is optimum if it,
among all possible plans, will deliver a prescribed length of hypotenuse
at the lowest cost. One plan is “better” than another if it will yield a
shorter hypotenuse than the other, for the same cost. There are a
number of nonsampling errors in all surveys, whether complete or
sample.⁶ The bias of nonresponse is only one of them. It exists, of course,
in complete counts as well as in samples. In fact, the conclusions to
be reached at the end will point to some drastic re-orientation of the
effort expended on complete counts. Both the bias of nonresponse and
the error of sampling exist in sample surveys. These are the two errors
that within any particular framework of design of sampling, inter-
viewing, and questioning, are direct functions of the size of the sample
and of the number of recalls.

[Figure 1: a right triangle whose legs are labeled “The bias of nonresponse” and “The standard error of response.”]

Figure 1. Any plan of survey will possess a bias of nonresponse and a standard
error of response. The right angle addition of the two forms the root-mean-square
error of the particular plan.
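
The right-triangle definition of the r-m-s error can be put in a few lines of code. The sketch below is only an illustration: the bias of 0.10, the unit standard deviation of 0.5, and the relation "standard error = sigma / sqrt(n)" (simple random sampling, no recalls) are assumptions introduced here and are not figures from the paper.

```python
import math

def rms_error(bias, standard_error):
    """Hypotenuse of the right triangle whose legs are the bias of
    nonresponse and the standard error of response."""
    return math.hypot(bias, standard_error)

# Hypothetical plan: the bias stays fixed while the sample grows.
BIAS = 0.10    # assumed bias of nonresponse (in units of the mean)
SIGMA = 0.50   # assumed standard deviation of a single response

rms_by_n = {n: rms_error(BIAS, SIGMA / math.sqrt(n))
            for n in (100, 400, 1600, 6400)}

for n, rms in rms_by_n.items():
    print(f"n = {n:5d}   r-m-s error = {rms:.4f}")
```

With a fixed bias, enlarging the sample shortens only one leg of the triangle, so the hypotenuse approaches the bias from above and can never fall below it.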

⁶ A list of such errors with discussion is contained in Chapter 2 of Deming's Some Theory of Sampling
[...] and in an article entitled “On errors in surveys,” American Sociological Review, ix [...].

As one seldom knows the resultant magnitude of all the non-
sampling errors, and as they vary from one survey to another, the most
sensible magnitude to aim at for the r-m-s error of the combination of
sampling and of nonresponse (the hypotenuse in Fig. 1) will vary like-
wise. One might aim at a r-m-s error of 7% in one survey, at 10% in
another, and at 20% in another. Even with unlimited expenditure to
reduce the r-m-s error to very low proportions, other errors will still
be present unless funds are diverted to reduce them also.


QUANTIFICATION OF THE PROBLEM 


The probability mechanism or model will now be described. The
population to be sampled will be divided into six classes, according to
the average proportion of interviews that will be completed success-
fully out of 8 attempts. The classes will be designated by 0, 1, 2, 4,
6, 8 to denote 0, 1, 2, 4, 6, 8 interviews completed, on the average, out
of 8 attempts. These figures will often appear as subscripts to various
other symbols. Six classes will be sufficient: more classes would not alter
the results enough to warrant the extra labor.

We assume that under the conditions specified for any particular 
survey, failure to obtain an interview may arise from a multitude of 
causes, which are manifest as not at home and refusal. We assume that 
people that refuse are of two kinds, those that give permanent refusals 
and those that give temporary refusals. People that give permanent 
refusals will never respond to any kind of treatment (they are a part 
of Class 0 defined more explicitly later). People that give temporary 
refusals are the kind that will refuse sometimes but will grant inter- 
views at other times or to other interviewers. An example of a tem- 
porary refusal is a case where the wrong interviewer called, or the right 
one called at the wrong time—woman bathing the baby, indisposed, 
family at dinner, etc. An interview might have been obtained with
better luck in timing, or better luck in the selection of the interviewer. 

Class 0 contains the stubborn core of permanent impregnable re-
fusals, plus the people who are never at home, gone to Florida, etc.,
or who are drunk when you do finally find them, or who turn out to
be incapacitated otherwise and can not possibly give meaningful an-
swers. At this moment we may note that the magnitude of this class
varies widely, dependent on the type of information called for by the
survey, and on the procedure of getting it. In a census, when people
are away, or refuse, or are incapable of giving information, a good
share of the required information can usually be obtained from neigh-
bors, and is, although information on income must usually in such
cases be left unanswered. Thus Class 0 in a count of the number of
inhabitants only is doubtless well below 1%, being reduced by the
cooperation of neighbors. But in surveys whose express purpose is
income, expenditures, savings, medical history, the neighbors are not
able to help, and Class 0 is bigger. I assume it to be 5% in the calcula-
tions to be presented here.

At the other extreme is Class 8, the people who 8 times out of 8 are
at home and answer the questions. Moving inward from the soft outer
shell (Class 8) toward the impregnable core, we encounter layers of
increasing density. In Classes 6, 4, 2, 1 are the temporary refusals and
the people who are not home all the time. In Class 6 an interviewer will
be successful at finding the respondent at home and in getting an inter-
view, on the average, 6 times out of 8; in Class 4, 4 times out of 8.

Thus, we have not merely responding units and nonresponding units.
Neither have we merely an overall proportion of response nor of non-
response, but rather, response and (except for Class 8) nonresponse
from each of several classes. We have not a mean value of some char-
acteristic for the responses and some other value for the nonresponses;
instead, each class possesses a mean and a variance. We are concerned
with the cumulative results from all classes.


THE PATIENT MEAN 
We define the “patient mean” as 


$$a^* = \frac{\sum_{i=1}^{8} p_i a_i}{\sum_{i=1}^{8} p_i} = \frac{\sum_{i=1}^{8} p_i a_i}{1 - p_0} \tag{1}$$


wherein $a_i$ is the mean value per sampling unit of some particular
characteristic (rent, number of people employed, or something else)
in Class $i$, and $p_i$ is the proportion in this class. The patient mean will
be the datum from which we reckon the biases in later calculations,
and the unit in which we shall measure the bias and the root-mean-
square error of any plan. It is the result of calling back patiently ad
infinitum on all the people in Classes 1, 2, 4, 6, 8. The members of
Class 0 will also be included in the recall because in practice we have
no way of separating them out; but as they yield no response, they
contribute nothing to the patient mean.


THE INITIAL SAMPLE (ATTEMPT I)


The treatment will be simplified by the assumption that the initial 
sample is the mere drawing of n names from a list of N names (the
frame). А more complex plan will cause no important modification in 
the conclusions with respect to the necessity for recalls, nor with 
respect to the number of recalls required for the most economical plan. 
It will not modify seriously the comparison with the Politz plan. It 
will, however, change the absolute figures on cost, but these are not the 
aim of this study; they are auxiliary only. By further assumption the 
frame will be so large compared with the sample that the multinomial 
term 

$$\frac{n!}{n_0!\, n_1!\, n_2! \cdots n_8!}\; p_0^{n_0} p_1^{n_1} p_2^{n_2} \cdots p_8^{n_8} \tag{2}$$

gives the probability that in the initial sample (Attempt I), there will
be $n_i$ names in Class $i$. $n$ is the size of the initial sample. $n_i$ is a random
variable; $p_i$ and $n$ are constants, satisfying the equations

$$\sum n_i = n \tag{3}$$

$$\sum p_i = 1. \tag{4}$$


If the sample (n) is as great as 10 per cent or more of the frame,
the variances and the biases to be computed should be reduced ap-
proximately by the factor 1 − n/N; in practice this reduction will be of
negligible importance.

When the returns from the initial call come in, we form from them
the numerical average for some particular characteristic and denote
it by $x(\mathrm{I})$. According to the particular mechanism postulated, the
composition of $x(\mathrm{I})$ will be the fraction

$$x(\mathrm{I}) = \frac{\text{Sum of all the numerical values in the responses of Attempt I}}{\text{Number of responses in Attempt I}} \tag{5}$$

If we were able to separate the returns by class, this would appear as

$$x(\mathrm{I}) = \frac{\sum R_i x_i}{\sum R_i} \qquad \text{[Here and hereafter, sums will run over all classes unless indicated otherwise]} \tag{5a}$$

wherein $R_i$ represents the number of responses from Class $i$, and $x_i$
represents the mean of the $R_i$ responses. Both $R_i$ and $x_i$ are random
variables. Their expected values are

$$E x_i = a_i \tag{6}$$
$$E R_i = n \pi_i p_i \tag{7}$$

where

$$\pi_i = i/8 \tag{8}$$

The variance of $x_i$ will be

$$\operatorname{Var} x_i = \frac{\sigma_i^2}{n \pi_i p_i}\left(1 + \frac{1 - \pi_i p_i}{n \pi_i p_i}\right) \tag{9}$$

wherein $\sigma_i$ is the standard deviation of the particular characteristic
in Class $i$. In what follows we shall drop terms in $1/n^2$; hence we shall
have no further use for the term $(1 - \pi_i p_i)/n\pi_i p_i$ in the last equation.

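The quantity $x(\mathrm{I})$ in Equation 5 is a random variable. Under the
assumed probability its expected value will be

$$E(\mathrm{I}) = \frac{G}{H} \tag{10}$$

and its variance will be

$$\operatorname{Var}(\mathrm{I}) = \frac{1}{nH^2} \sum \pi_i p_i \left\{\sigma_i^2 + [a_i - E(\mathrm{I})]^2\right\}, \tag{11}$$

where for convenience

$$G = \sum \pi_i p_i a_i \tag{12}$$
$$H = \sum \pi_i p_i \tag{13}$$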


The derivation of Equations 10 and 11 is simple in the light of
certain well-known principles of sampling. Let each sampling unit
possess 8 cells, each one NR or R (NR for no response, R for response)
according to the following distribution:

    Class 0,  8 NR,  0 R
    Class 1,  7 NR,  1 R
    Class 2,  6 NR,  2 R
    Class 4,  4 NR,  4 R
    Class 6,  2 NR,  6 R
    Class 8,  0 NR,  8 R

Now when we draw a sample, we in effect draw first a sampling unit,
which will belong to one of the above classes. Next, we draw 1 of its
8 cells at random to determine whether we get a response. If we draw
an R-cell (a response), we write down the number $x_{ij}$, which will be a
random variable, the same for all the R-cells of an individual, but
varying from one individual to another. If we draw an NR-cell (no response),
we make no record at all. The probability of getting a response in the
double drawing (first, an individual sampling unit; second, a cell) is
$\pi_i p_i$, which is only the expected proportion of all the responses that
will fall in Class $i$.

The mean of the entire set of responses in the frame will be

$$\frac{\sum \pi_i p_i a_i}{\sum \pi_i p_i} = \frac{G}{H} \tag{14}$$

and their variance will be²

$$\sigma^2 = \frac{1}{H} \sum \pi_i p_i \left[\sigma_i^2 + \left(a_i - \frac{G}{H}\right)^2\right]. \tag{15}$$

The double drawing is a random procedure in which each cell has
the same probability as any other in the entire frame. The mean of
the returns of a sample will therefore give an unbiased estimate of
the mean of the entire set of responses; but this is only a restatement
of Equation 10. The expected number of responses in a sample of $n$ is
$n \sum \pi_i p_i$, wherefore the variance of a sample of $n$ will be very closely
equal to $\sigma^2 / (n \sum \pi_i p_i)$; but this is only a restatement of Equation 11.
And thus Equations 10 and 11 are established.³

The bias in the expected result $E(\mathrm{I})$ of Attempt I will be defined as

$$B(\mathrm{I}) = E(\mathrm{I}) - a^*. \tag{16}$$

The mean square error of $x(\mathrm{I})$ will then be

$$\operatorname{Mse}(\mathrm{I}) = \operatorname{Var}(\mathrm{I}) + B^2(\mathrm{I}). \tag{17}$$

If Figure 1 were drawn for Attempt I, the two terms on the right of this
equation would be the squares of the two legs of the triangle, and the
left-hand member would be the square of the hypotenuse.
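
The chain from Equation 10 to Equation 17 can be traced numerically. The sketch below is only an illustration under the assumptions of Table 1 (the $p_i$, the 1st set of $a_i$, and $\sigma_i = a_i$) with $\pi_i = i/8$; the function name `attempt_one` is hypothetical.

```python
# Attempt I under the assumed mechanism: Equations 1, 10-13, 16 and 17.
# Inputs are the Table 1 assumptions (1st set of a_i, sigma_i = a_i).
# Class 0 has pi_0 = 0, so it drops out of G and H and is omitted here.
P = {1: 0.10, 2: 0.10, 4: 0.20, 6: 0.25, 8: 0.30}
A = {1: 2.00, 2: 1.75, 4: 1.50, 6: 1.25, 8: 1.00}
SIGMA = dict(A)


def attempt_one(n, p=P, a=A, sigma=SIGMA):
    """Return (E(I), Var(I), B(I), Mse(I), a*) for an initial sample of n."""
    pi = {i: i / 8 for i in p}                                # Equation 8
    G = sum(pi[i] * p[i] * a[i] for i in p)                   # Equation 12
    H = sum(pi[i] * p[i] for i in p)                          # Equation 13
    expected = G / H                                          # Equation 10
    variance = sum(
        pi[i] * p[i] * (sigma[i] ** 2 + (a[i] - expected) ** 2) for i in p
    ) / (n * H ** 2)                                          # Equation 11
    a_star = sum(p[i] * a[i] for i in p) / sum(p.values())    # Equation 1
    bias = expected - a_star                                  # Equation 16
    mse = variance + bias ** 2                                # Equation 17
    return expected, variance, bias, mse, a_star


if __name__ == "__main__":
    expected, variance, bias, mse, a_star = attempt_one(1000)
    print("relative bias :", round(bias / a_star, 6))
    print("relative m-s-e:", round(mse / a_star ** 2, 6))
```

With $n = 1000$ the relative bias and relative mean square error come out close to the Plan I entries of Table 5 (-.110874 and .013661).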

² This is the formula for the variance of a composite universe; see, for
example, the author's Some Theory of Sampling (John Wiley & Sons, 1950),
pp. 58 and 59.

³ Mr. Benjamin J. Tepping discovered this simple way of deriving Equations 10
and 11. He furnished also algebraic proofs, but they seem not to be required.

ATTEMPTS II, III, IV


The nonresponses left over from the first attempt form a new frame.
The sampling plan may prescribe 0, 1, 2, or more recalls on a sample of
these nonresponses.

The 1st recall will be identified here as Attempt II. The 2d and 3d
recalls will be Attempts III and IV.

The determination of the optimum fraction ($y$) of the nonresponses
of Attempt I to draw for recall will be a subject for investigation in a
later paragraph.

The bias of nonresponse arises from Classes 1, 2, 4, 6. Each successive
attempt digs deeper into the lower classes, and diminishes the relative
proportions that remain in the upper classes. Class 8 is in fact wiped
out in Attempt I. In this way the combination of successive attempts
pushes the accumulated result closer and closer to the patient mean $a^*$.

We assume that each attempt picks up a random sample of the
nonresponses in each class. This is not what happens, but it is probably
impossible to put down an equation for what actually happens. The
interviewers use ingenuity. They find out from neighbors when the
people now absent will be at home. They make observations; they make
appointments. They hold conferences to decide which one of them
might best succeed in breaking down a refusal. Working for and working
against the interviewers is some softening and also some hardening
of the hearts of people who refused at an earlier call. I have seen them
both. The net result is probably that the recalls are less costly (as
Houseman says) than I assume in Table 3, and more successful than
this theory indicates. If so, then the recommendations for recalls are
even stronger than one may conclude from this theory alone.

Equations 10 and 11 apply also for the results of Attempts II,
III, IV, if $n$ is treated in any attempt as the number of interviews
attempted, and if $p_i$ in Equations 10-13 is replaced by:

$$(1-\pi_i)\,p_i \Big/ \sum (1-\pi_i)\,p_i \qquad \text{Attempt II}$$
$$(1-\pi_i)^2\,p_i \Big/ \sum (1-\pi_i)^2\,p_i \qquad \text{Attempt III}$$
$$(1-\pi_i)^3\,p_i \Big/ \sum (1-\pi_i)^3\,p_i \qquad \text{Attempt IV}$$

Class 8 contributes nothing to these sums, being wiped out by the
factor $1-\pi_i$, which is 0 when $i = 8$.
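
The effect of the successive factors $(1-\pi_i)$ on the sizes of the attempts can be sketched as follows. This is an illustration only, using the proportions $p_i$ assumed in Table 1 and $\pi_i = i/8$; the function name `expected_counts` and its arguments are hypothetical.

```python
# Expected interviews, responses and nonresponses in each attempt.
# P holds the proportions assumed in Table 1, Class 0 included, because
# its members are interviewed (without success) in every attempt.
P = {0: 0.05, 1: 0.10, 2: 0.10, 4: 0.20, 6: 0.25, 8: 0.30}


def expected_counts(attempt, n=1.0, y=1.0, p=P):
    """attempt = 1 for Attempt I, 2 for Attempt II, and so on.

    Returns (interviews, responses, nonresponses). Recalls are made on
    the fraction y of the nonresponses left over from Attempt I.
    """
    scale = n if attempt == 1 else n * y
    # Share of each class that has not yet responded before this attempt.
    left = {i: (1 - i / 8) ** (attempt - 1) * p[i] for i in p}
    interviews = scale * sum(left.values())
    responses = scale * sum((i / 8) * left[i] for i in p)
    return interviews, responses, interviews - responses


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, [round(v, 1) for v in expected_counts(k, n=1000)])
```

With $n = 1000$ and $y$ left as a factor (here set to 1), the first rows reproduce the numerical part of Table 2: 1000, 625.0, 375.0 for Attempt I and 375.0, 126.6, 248.4 for Attempt II.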
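
EQUATIONS FOR THE COMBINATION OF ATTEMPTS


If the plan of survey calls for two recalls, we combine the results of
Attempts I, II, III. With an obvious extension of notation, the result
of this combination will be

$$x(\mathrm{I+II+III}) = u_{\mathrm{I}}\, x(\mathrm{I}) + u_{\mathrm{II}}\, x(\mathrm{II}) + u_{\mathrm{III}}\, x(\mathrm{III}), \tag{18}$$

where $u_{\mathrm{I}}$, $u_{\mathrm{II}}$, $u_{\mathrm{III}}$ are weights. If $R_{\mathrm{I}}$, $R_{\mathrm{II}}$, $R_{\mathrm{III}}$ are the responses in the
three separate attempts, then

$$u_{\mathrm{I}},\ u_{\mathrm{II}},\ u_{\mathrm{III}} = \frac{R_{\mathrm{I}},\ R_{\mathrm{II}},\ R_{\mathrm{III}}}{R_{\mathrm{I}} + R_{\mathrm{II}} + R_{\mathrm{III}}}. \tag{19}$$

For the expected value of $x(\mathrm{I+II+III})$ we may write with sufficient
approximation

$$E(\mathrm{I+II+III}) = w_{\mathrm{I}} E(\mathrm{I}) + w_{\mathrm{II}} E(\mathrm{II}) + w_{\mathrm{III}} E(\mathrm{III}), \tag{20}$$

wherein $w_{\mathrm{I}}$, $w_{\mathrm{II}}$, $w_{\mathrm{III}}$ are the expectations of $u_{\mathrm{I}}$, $u_{\mathrm{II}}$, $u_{\mathrm{III}}$. Formally, with
sufficient approximation,

$$w_{\mathrm{I}},\ w_{\mathrm{II}},\ w_{\mathrm{III}} = \frac{\sum \pi_i p_i,\ \ \sum \pi_i p_i (1-\pi_i),\ \ \sum \pi_i p_i (1-\pi_i)^2}{\sum \pi_i p_i \left[1 + (1-\pi_i) + (1-\pi_i)^2\right]}. \tag{21}$$

Before proceeding, we note that

$$u_{\mathrm{I}} + u_{\mathrm{II}} + u_{\mathrm{III}} = 1, \qquad w_{\mathrm{I}} + w_{\mathrm{II}} + w_{\mathrm{III}} = 1. \tag{22}$$

The bias of $x(\mathrm{I+II+III})$, the combined results of Attempts I, II,
III, will be defined as

$$B(\mathrm{I+II+III}) = E(\mathrm{I+II+III}) - a^*. \tag{23}$$

The variance of $x(\mathrm{I+II+III})$ may be computed as

$$\operatorname{Var}(\mathrm{I+II+III}) = w_{\mathrm{I}}^2 \operatorname{Var}(\mathrm{I}) + w_{\mathrm{II}}^2 \operatorname{Var}(\mathrm{II}) + w_{\mathrm{III}}^2 \operatorname{Var}(\mathrm{III}). \tag{24}$$

The notation in the above equations can easily be extended or
contracted to more or to fewer attempts. For a plan that uses only one
recall, we simply drop the symbol III; also the term $(1-\pi_i)^2$ in
Equation 21. For a plan that uses three recalls, we annex a term in IV, and
replace $(1-\pi_i)^2$ by $(1-\pi_i)^2 + (1-\pi_i)^3$.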
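
A sketch of how Equations 10, 11, 20, 21, 23 and 24 combine for any number of attempts is given below. It rests on the Table 1 assumptions (the $p_i$, the 1st set of $a_i$, $\sigma_i = a_i$, $\pi_i = i/8$) and on treating $n$ in each recall as the number of interviews attempted; the function name `combined_plan` is hypothetical, and the code is an illustration rather than the computation actually used for the tables.

```python
# Relative bias and relative m-s-e of the pooled result of Attempts I..k.
# P includes Class 0 (needed for the number of interviews attempted);
# A is the 1st set of a_i from Table 1, with sigma_i assumed equal to a_i.
P = {0: 0.05, 1: 0.10, 2: 0.10, 4: 0.20, 6: 0.25, 8: 0.30}
A = {1: 2.00, 2: 1.75, 4: 1.50, 6: 1.25, 8: 1.00}


def combined_plan(attempts, n, y, p=P, a=A):
    """Return (relative bias, relative m-s-e), both in units of a*."""
    a_star = sum(p[i] * a[i] for i in a) / sum(p[i] for i in a)
    shares, means, variances = [], [], []
    for k in range(1, attempts + 1):
        # Proportion of the frame in each class still unanswered before attempt k.
        left = {i: (1 - i / 8) ** (k - 1) * p[i] for i in p}
        frac = sum(left.values())
        tried = (n if k == 1 else n * y) * frac      # interviews attempted
        # pi_i times the modified p_i for this attempt.
        resp = {i: (i / 8) * left[i] / frac for i in a}
        H = sum(resp.values())
        mean = sum(resp[i] * a[i] for i in a) / H                      # Eq. 10
        var = sum(resp[i] * (a[i] ** 2 + (a[i] - mean) ** 2)
                  for i in a) / (tried * H ** 2)                       # Eq. 11
        shares.append(sum((i / 8) * left[i] for i in a))   # numerators, Eq. 21
        means.append(mean)
        variances.append(var)
    total = sum(shares)
    w = [s / total for s in shares]                                    # Eq. 21
    expected = sum(wk * m for wk, m in zip(w, means))                  # Eq. 20
    variance = sum(wk ** 2 * v for wk, v in zip(w, variances))         # Eq. 24
    bias = expected - a_star                                           # Eq. 23
    return bias / a_star, (variance + bias ** 2) / a_star ** 2


if __name__ == "__main__":
    print(combined_plan(1, 1000, 0.6))
    print(combined_plan(2, 1000, 0.6))
```

With $n = 1000$ and $y = .6$, one and two attempts give figures close to the Plan I and Plan I+II columns of Table 5.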
THE POLITZ PLAN 


The Politz plan includes questions to inquire of each person found
at home, and who does not refuse, to ascertain whether he was at home
last night at this time, the night before last, etc., to cover the 5 nights
preceding the interview, 6 nights in all. Each return is given a weight
$w_t$, the reciprocal of the number of nights at home over the period of 6
successive nights. The result of applying the Politz plan will be the
random variable

$$x(P) = \frac{S\, w_t R_t x_t}{S\, w_t R_t} \tag{25}$$

wherein $S$ denotes the sum over the 6 Politz classes, and wherein $R_t$
and $x_t$ denote the number of responses and their mean value in the
Politz class $t$. $w_t = 6/(1+t)$, where $t$ is the number of nights at home
during the preceding 5 nights. $w_t$, $R_t$, and $x_t$ are all random variables.

In each class except Class 8 it is possible for a person to be at home,
during the preceding 5 nights, some number of nights other than his
average ($\pi_i$). Thus, $E w_t$ is not the reciprocal of $\pi_i$, but takes the values
shown in Equation 29. By applying the formula

$$E\,\frac{u}{v} = \frac{Eu}{Ev}\left(1 + C_v^2 - \rho\, C_u C_v\right) \tag{26}$$

it is possible to find the expected value of $x(P)$ and to show that the
Politz correction for not being at home leads to the bias


В(Р) = Ex(P) — а* [Definition | 
1 
= ap — at — (G = BL 52 (rip) (B; — A?) (a; — ар) 


T = » TipiD (a0; SET 2 . Q7) 
The terms in the braces are very small numerically, and we accept
with sufficient approximation for our purpose,

$$B(P) = a_P - a^* \tag{28}$$

wherein $\pi_i = i/8$, as heretofore, and

$$A_i = E w_t = E\,\frac{6}{1+t} \qquad \text{[For Class } i\text{]}$$
$$= 6 \sum_{t=0}^{5} \frac{1}{1+t} \binom{5}{t} \pi_i^{\,t} (1-\pi_i)^{5-t} \qquad \text{[Assuming that } t \text{ is a binomial variate]}$$
$$= \frac{1}{\pi_i}\left[1 - (1-\pi_i)^6\right], \tag{29}$$

$$B_i = E w_t^2 = E\left(\frac{6}{1+t}\right)^2 = 36 \sum_{t=0}^{5} \frac{1}{(1+t)^2} \binom{5}{t} \pi_i^{\,t} (1-\pi_i)^{5-t}, \tag{30}$$

$$a_P = \frac{\sum \pi_i p_i A_i a_i}{\sum \pi_i p_i A_i} = \frac{\sum p_i \left[1 - (1-\pi_i)^6\right] a_i}{\sum p_i \left[1 - (1-\pi_i)^6\right]}, \tag{31}$$

$$V = \sum \pi_i p_i A_i. \tag{32}$$
The bias of a plan that uses $k-1$ recalls may be written

$$B(\mathrm{I\text{-}K}) = \frac{\sum p_i \left[1 - (1-\pi_i)^k\right] a_i}{\sum p_i \left[1 - (1-\pi_i)^k\right]} - a^* \tag{33}$$

to the same approximation that appears in Equation 28. With $k=3$,
for example, this form gives a numerical verification of the bias
$B(\mathrm{I+II+III})$ calculated otherwise by Equations 20 and 23.
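
The closed form in Equation 29 and the bias formula of Equation 33 can both be checked with a few lines of code. The sketch below assumes that $t$ is a binomial variate with 5 trials and probability $\pi_i$, and uses the $p_i$ and the 1st set of $a_i$ from Table 1; the function names are hypothetical.

```python
from math import comb

# Table 1 assumptions for the responding classes (1st set of a_i).
P = {1: 0.10, 2: 0.10, 4: 0.20, 6: 0.25, 8: 0.30}
A = {1: 2.00, 2: 1.75, 4: 1.50, 6: 1.25, 8: 1.00}


def politz_weight_mean(pi, nights=5):
    """E[6/(1+t)] for t ~ Binomial(5, pi): the sum in Equation 29."""
    return sum(
        (nights + 1) / (1 + t) * comb(nights, t) * pi ** t * (1 - pi) ** (nights - t)
        for t in range(nights + 1)
    )


def relative_bias(k, p=P, a=A):
    """Equation 33 divided by a*, for a plan of k attempts (k - 1 recalls)."""
    a_star = sum(p[i] * a[i] for i in p) / sum(p.values())
    reach = {i: p[i] * (1 - (1 - i / 8) ** k) for i in p}
    expected = sum(reach[i] * a[i] for i in p) / sum(reach.values())
    return expected / a_star - 1


if __name__ == "__main__":
    for i in P:
        pi = i / 8
        print(i, politz_weight_mean(pi), (1 - (1 - pi) ** 6) / pi)
    for k in range(1, 8):
        print(k, round(relative_bias(k), 6))
```

Because Equation 31 has the same form as Equation 33 with $k = 6$, `relative_bias(6)` is also the Politz bias under the approximation of Equation 28.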

The variance of the Politz plan is⁴


Var (P) = > wipiBilo? + (a: — аһ 


nV? 
ТА 
2\2 у (rp:)?(B: - АӘ(>ш-ай- (80) 
+ (i - Sy E cora - 45 — on 


It is worth noting that if we place $A_i = B_i = 1$, the second term vanishes,
and the right-hand member reduces precisely to Equation 15, as it
should.

⁴ My equation for the Politz plan differs from the equations given by Politz and Simmons.

SEARCH FOR THE OPTIMUM PLAN


The accumulated mean square error of Attempts I, II, and III will
be

$$M(\mathrm{I+II+III}) = \operatorname{Var}(\mathrm{I+II+III}) + B^2(\mathrm{I+II+III}). \tag{35}$$

We drop the symbol III for a plan that calls for two attempts, and we
annex IV for a plan that calls for four attempts.

Any two plans may be expected to incur different costs and to yield
different mean square errors. As agreed at the beginning, a plan is
optimum if its cost is less than that of any other plan that will yield
the same mean square error. This is a matter of numerical calculation.

Numerical assignments to the various fundamental magnitudes ($p_i$,
$a_i$, $\sigma_i$) will occur two sections ahead.

We have one other task: the determination of the optimum
fraction $y$, a subject for the next section.


DETERMINATION OF THE OPTIMUM FRACTION OF NONRESPONSES 
TO INCLUDE IN THE RECALLS 

Let $y$ denote the fraction sought. We remind the reader that
Attempt III will be a canvass of all the nonresponses that remain from
Attempt II, and that Attempt IV will be a canvass of all the
nonresponses that remain from Attempt III. There is thus only the one
fraction $y$ to determine.

The mean square error ($M$) of the accumulated result of any
number of attempts may be written in terms of $n$ and $y$ as

$$M = A + B/n + C/ny, \tag{36}$$

the cost of which is

$$Y = Dn + Eny. \tag{37}$$

$A$, $B$, $C$, $D$, $E$ are constants. As before, $n$ is the initial sample for
Attempt I. By differentiation it can be shown that, for a fixed value of $Y$,
the minimum in $M$ occurs when

$$y^2 = \frac{CD}{BE}. \tag{38}$$

This result is independent of $n$, hence it holds for any initial size of
sample.
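
The differentiation that leads to Equation 38 can be checked numerically. In the sketch below, the function names are hypothetical, and the constants $B$ and $C$ are illustrative values worked out for this example (roughly what Equations 11 and 24 give for Plan I+II with the 1st set of $a_i$); they are not figures quoted in the text. $D = 3$ and $E = 1.875$ are the cost coefficients of Plan I+II in Table 3.

```python
from math import sqrt


def optimum_fraction(B, C, D, E):
    """Equation 38: y**2 = C*D / (B*E), for a fixed total cost Y."""
    return sqrt(C * D / (B * E))


def mse_at_cost(y, Y, A, B, C, D, E):
    """M of Equation 36 after solving Equation 37 for n at total cost Y."""
    n = Y / (D + E * y)
    return A + B / n + C / (n * y)


if __name__ == "__main__":
    # Illustrative constants (assumptions of this sketch, see lead-in).
    A_, B_, C_, D_, E_ = 0.0, 1.7377, 0.5200, 3.0, 1.875
    y_opt = optimum_fraction(B_, C_, D_, E_)
    # Brute-force search over a grid of y, at an arbitrary fixed cost.
    grid = [i / 1000 for i in range(1, 2001)]
    y_grid = min(grid, key=lambda v: mse_at_cost(v, 4000.0, A_, B_, C_, D_, E_))
    print(round(y_opt, 3), y_grid)
```

The closed form and the grid search agree, and the value falls near the .69 shown for Plan I-II in Table 4. Changing the fixed cost in the search leaves the minimizing $y$ unchanged, which is the independence from $n$ noted above.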

The equation for $y^2$ just given contains $D$ and $E$ only in the ratio
$D:E$, which shows that $y$ does not depend directly on the absolute
magnitudes to be assumed for the costs in Attempt I and later, but
rather on the ratio of these costs. And as $y$ will be proportional to
$\sqrt{D:E}$, $y$ is relatively insensitive to the ratio assumed for $D:E$.
Moreover, $y$ is not dependent on the absolute magnitudes of the $a_i$, but on
their ratio to any one of them, or to $a^*$, because $B$ and $C$ occur only in
the ratio $B:C$.

Table 4 shows the optimum values of $y$ obtained from Equation 38;
also the values selected for actual use in the calculations. The fraction
$y$ obviously varies slowly with the number of recalls. To simplify the
required calculations I have set $y = 3/5$ for all plans with the first set
of $a_i$; and $y = 1/4$ for all plans with the second set of $a_i$.

It may be of interest to note that the removal of the bias of
nonresponse by recalls is independent of the fraction $y$. It is not necessary
to recall on the optimum fraction, nor on any other particular
fraction, so far as the bias of nonresponse is concerned. However, as $y$
decreases, the cost goes down but the variance and the r-m-s error
increase, so it is wise not to make $y$ too small. The optimum fraction, if
it can be predicted on experience, or some approximation thereto, will
guide one close to the minimum r-m-s error for any permissible cost of
interviewing.


NUMERICAL MAGNITUDES ASSUMED 


In order to make numerical calculations and to derive conclusions
therefrom with respect to the most economical design of surveys, it is
necessary to assume some numerical magnitudes for the $p_i$, $a_i$, $\sigma_i$; also for
the costs. Unfortunately, no set of numerical magnitudes can be
typical of all conditions met in the field. I may interject the reminder that
every question on a questionnaire has not only its own particular
values of $a_i$ and of $\sigma_i$, but of $p_i$ as well, even within the same survey,
because some questions receive better cooperation than others. The best
that one can do is to make numerical assumptions that fit some of the
conditions met in practice, and to infer from the equations the range
of validity of the conclusions.

The basic numerical assumptions are in Table 1. The expected
number of interviews, of responses, and of nonresponses, are shown in
Table 2. The response rates (the $p_i$) assumed here are intended to
assimilate average urban experience on a question of moderate difficulty; and
without making them responsible for the final choice, I wish to thank
Messrs. Lester R. Frankel and Robert Weller of the Alfred Politz
Research organization for their help and interest in choosing these
particular values.

TABLE 1
NUMERICAL VALUES ASSUMED

                                            Class
Property and symbol            0     1     2     4     6     8     1-8
Proportion, p_i               .05   .10   .10   .20   .25   .30    .95
Mean value of the measured
characteristic:
    a_i (1st)                 xxx  2.00  1.75  1.50  1.25  1.00    a* = 1.355 263
    a_i (2d)                  xxx   .10   .20   .40   .60  1.00    a* = 0.589 474
Standard deviation, σ_i       xxx  Same as a_i in both sets of a_i

TABLE 2

THE EXPECTED SIZES OF SAMPLE IN THE VARIOUS ATTEMPTS,
BASED ON AN INITIAL SAMPLE OF n IN ATTEMPT I.
HERE THE SUMS RUN OVER ALL CLASSES, 0 TO 8

Attempt   Interviews                     Responses                 Nonresponses
I         n                              n Σ π_i p_i               n Σ (1-π_i) p_i
II        n_II  = ny Σ (1-π_i) p_i       ny Σ (1-π_i) π_i p_i      ny Σ (1-π_i)^2 p_i
III       n_III = ny Σ (1-π_i)^2 p_i     ny Σ (1-π_i)^2 π_i p_i    ny Σ (1-π_i)^3 p_i
IV        n_IV  = ny Σ (1-π_i)^3 p_i     ny Σ (1-π_i)^3 π_i p_i    ny Σ (1-π_i)^4 p_i
V         n_V   = ny Σ (1-π_i)^4 p_i     ny Σ (1-π_i)^4 π_i p_i    ny Σ (1-π_i)^5 p_i
VI        n_VI  = ny Σ (1-π_i)^5 p_i     ny Σ (1-π_i)^5 π_i p_i    ny Σ (1-π_i)^6 p_i
VII       n_VII = ny Σ (1-π_i)^6 p_i     ny Σ (1-π_i)^6 π_i p_i    ny Σ (1-π_i)^7 p_i

Numerical values based on an initial sample of n = 1000

Attempt   Interviews          Responses   Nonresponses
I         n     = 1000        625.0       375.0
II        n_II  = 375.0y      126.6y      248.4y
III       n_III = 248.4y       60.3y      188.1y
IV        n_IV  = 188.1y       34.4y      153.7y
V         n_V   = 153.7y       22.2y      131.5y
VI        n_VI  = 131.5y       15.6y      115.9y
VII       n_VII = 115.9y       11.7y      104.2y

Fortunately, there is a great deal more generality in the two sets of
$a_i$ than may be apparent at first sight, for one may transform either one
of these sets into almost any other that he may encounter. For
example, to discuss a yes-and-no survey in which the proportions of yes
vary from 60% in Class 1 to 40% in Class 8, one has only to derive a
new value $a_i'$ from an old $a_i$ by setting

$$a_i' = 20 + 20a_i \tag{39}$$

where $a_i$ on the right belongs to the 1st set of $a_i$ in Table 1. Both $a_i$
and $a_i' - 20$ have a 2-fold variation from Class 1 to Class 8. The new
patient mean is

$$a'^* = 20 + 20a^* \tag{40}$$

where $a^* = 1.355\ 263$, the patient mean of the 1st set of $a_i$, as given in
Table 1. The relative bias computed for $a_i' - 20$, for any number of
attempts, will be precisely the same as the relative bias computed for $a_i$
(Table 5). It follows that the new expected value for any number of
attempts will be

$$E' = 20a^* \,\text{Rel } B + a'^* = 47.105 + 27.105\, \text{Rel } B \tag{41}$$

where Rel $B$ is the relative bias shown in Table 5 for the corresponding
number of attempts. An example will occur later (Table 9).
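
A small numerical check of Equations 39 to 41 may be helpful. The sketch below assumes the 1st set of $a_i$ and the relative bias of Plan I from Table 5; the function name and its arguments are hypothetical.

```python
# Transformation of the 1st set of a_i to a yes-and-no scale (Equations 39-41).
A_STAR = 1.2875 / 0.95   # patient mean of the 1st set of a_i, about 1.355263


def transformed_expected_value(rel_bias, shift=20.0, scale=20.0, a_star=A_STAR):
    """Equation 41: expected value on the scale a_i' = shift + scale * a_i."""
    new_patient_mean = shift + scale * a_star            # Equation 40
    return new_patient_mean + scale * a_star * rel_bias


if __name__ == "__main__":
    print(round(transformed_expected_value(0.0), 3))        # the new patient mean
    print(round(transformed_expected_value(-0.110874), 3))  # Plan I, no recalls
```

For Plan I the result, about 44.1, equals $20 + 20 \times 1.205$, that is, Equation 39 applied directly to the expected value of Attempt I on the old scale.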

The 2d set of $a_i$ could serve the same purpose by a suitable
transformation, but we shall not carry it through.

Thus, in spite of the limitations of any particular set of numerical
assumptions, the conclusions to be drawn will warrant some sweeping
generalizations.

COSTS


For the costs of making calls (interviewing only) we assume for
calculation the following figures:

    For Attempt I, $3 per call
    For later attempts, $5 per call
    For the Politz plan, $4 per name (this amount will cover the cost
    of weighting and of calling back on the temporary refusals)

Table 3 shows the costs of interviewing derived from the values
assumed for the $p_i$ in Table 1, and with the cost per call as mentioned
earlier. $n$ is the size of the initial sample, and $y$ is the fraction of the
nonresponses left over from Attempt I that constitute the sample for
Attempt II.

TABLE 3
COSTS OF INTERVIEWING

Plan                               No. of recalls   Cost (dollars)
Attempt I                          0                Y = 3n
Attempts I-II                       .3750ny         3n + 1.8750ny
Attempts I-III                      .6234ny         3n + 3.1172ny
Attempts I-IV                       .8115ny         3n + 4.0576ny
Attempts I-V                        .9653ny         3n + 4.8263ny
Attempts I-VI                      1.0968ny         3n + 5.4839ny
Attempts I-VII                     1.2126ny         3n + 6.0632ny
Politz (equivalent to 5 recalls)                    4n

The actual numerical magnitudes of these costs are not so important
as their relative magnitudes. If all the costs were doubled, the cost
computed for any plan would be doubled, but the relative costs and the
relative merits of the various plans would remain unchanged.
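
The cost coefficients of Table 3 follow from the proportions of Table 1 and the assumed costs per call. The sketch below shows the arithmetic; the function names are hypothetical, and the defaults of $3 and $5 per call are the figures assumed above.

```python
# Cost of interviewing for a plan of several attempts (Table 3).
# P holds the proportions assumed in Table 1, Class 0 included.
P = {0: 0.05, 1: 0.10, 2: 0.10, 4: 0.20, 6: 0.25, 8: 0.30}


def recall_interviews(attempts, p=P):
    """Recall interviews in units of n*y: the nonresponses carried
    into Attempts II, III, ..., up to the last attempt of the plan."""
    return sum(
        sum((1 - i / 8) ** k * p[i] for i in p) for k in range(1, attempts)
    )


def interviewing_cost(attempts, n, y, first=3.0, later=5.0):
    """Dollars: `first` per call in Attempt I, `later` per call afterwards."""
    return first * n + later * recall_interviews(attempts) * n * y


if __name__ == "__main__":
    for attempts in range(1, 8):
        print(attempts, round(later_coeff := 5 * recall_interviews(attempts), 4))
    print(interviewing_cost(2, 1000, 0.6))
```

For Plan I+II with $n = 1000$ and $y = .6$ the cost is $4125, the figure shown at the foot of Table 5.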


TABLE 4

RESULTS FOR THE OPTIMUM y, AND THE VALUES
SELECTED FOR THE CALCULATIONS THAT LED
TO TABLES 5 AND 6, AND TO FIGS. 2 AND 3

              1st set of a_i                  2d set of a_i
Plan      y calculated    y selected      y calculated    y selected
          from            for             from            for
          Equation 38     calculation     Equation 38     calculation
I-II      .69             3:5                             1:4
I-III     .67             3:5                             1:4
I-IV      .65             3:5                             1:4
I-V       .63             3:5                             1:4
I-VI      .61             3:5                             1:4
I-VII     .60             3:5                             1:4

It should be noted that these costs are for the interviewing only.
Considerations of overhead costs, training, and office-work for the
different plans must be taken into account before one decides definitely
whether one plan is more economical than another.


CONCLUSIONS FROM THE CALCULATIONS 


The numerical results of the calculations are in Tables 5, 6, 7, 8 and
in Figs. 2 and 3. The biases and r-m-s errors are expressed in units of
$a^*$. The base for the bias is the 0-point of the scale for the $a_i$. The
estimation is assumed to be a summation of the initial call and the recalls.
The aim is assumed to be the estimation of an average or of a total.

TABLE 5

NUMERICAL VALUES OF THE BIASES AND R-M-S ERRORS FOR
VARIOUS SIZES OF INITIAL SAMPLE (n); 1st SET OF a_i,
y = .6. COSTS AT n = 1000

                   Plan        I          I+II
           Rel bias         -.110874   -.075752
n = 100    Rel m-s-e         .025974    .019918
           Rel r-m-s-e       .161164    .141131
n = 200    Rel m-s-e         .019133    .012828
           Rel r-m-s-e       .138322    .113261
n = 300    Rel m-s-e         .016853    .010464
           Rel r-m-s-e       .129822    .102294
n = 500    Rel m-s-e         .015029    .008574
           Rel r-m-s-e       .122593    .092596
n = 1000   Rel m-s-e         .013661    .007156
           Rel r-m-s-e       .116880    .084593
n = 2000   Rel m-s-e         .012977    .006447
           Rel r-m-s-e       .113920    .080293
n = 3000   Rel m-s-e         .012749    .006211
           Rel r-m-s-e       .112911    .078810
n = 5000   Rel m-s-e         .012567    .006022
           Rel r-m-s-e       .112103    .077602
Costs at n = 1000            $3000      4125


A. Conclusions from the 1st set of $a_i$, a 2-fold variation from $a_1$ to $a_8$:
Table 5 and Fig. 2. Conclusions 1, 2, 3, 4, and 5b are independent of the
type and size of sample.

1. With no recalls at all (Attempt I only), the minimum relative
r-m-s error attainable is 11%. No sample however big, not even a
complete count, can penetrate below this minimum, without recalls.


Figure 2. The relative bias, the relative r-m-s error, and the cost, plotted against
the initial sample-size ($n$) for various plans, for the 1st set of $a_i$, in which $a_1 = 2a_8$.
The curves show the futility of attempting to achieve accuracy by sheer size of
sample. Recalls are much more effective. The dashed lines show the size of
sample required, and the cost, to yield a relative r-m-s error of 7½%. The relative
biases and the relative r-m-s errors are in units of $a^*$.


2. With one recall (Attempts I+II), the minimum r-m-s error drops
to 7.6%. No sample however big can penetrate below this minimum
with only one recall.

3. With 2 recalls (Attempts I+II+III), the minimum r-m-s error
drops to 5.7%. No sample however big can penetrate below this
minimum with only two recalls.

4. With 3 recalls (Attempts I-IV), the minimum r-m-s error drops
to 4.5%. With 4, 5, and 6 recalls, the minimum r-m-s error drops to 3.7,
3.0, and 2.5%.

5. To attain a prescribed r-m-s error of (e.g.) 7½%:

(a) We may use 3, 4, 5, or 6 recalls with initial samples as shown in the
accompanying table.

    From Fig. 2
    No. of recalls    Initial sample    Cost
    6                 345               $2,290
    5                 378                2,390
    4                 408                2,450
    3                 512                2,800

(b) With 0, 1, or 2 recalls we can not attain the prescribed r-m-s error (7½%)
with any sample however big.


B. Conclusions from the 2d set of $a_i$, a 10-fold variation from $a_1$ to $a_8$:
Table 6 and Fig. 3. Conclusions 6, 7, 8, 9, and 10b are independent of the
type and size of sample.

6. With no recalls at all (Attempt I only), the minimum r-m-s error
attainable is 24.5%. No sample however big, not even a complete
count, can penetrate below this minimum without recalls.

7. With one recall (Attempts I+II), the minimum r-m-s error drops
to 15.5%. No sample however big can penetrate below this minimum
with only one recall.

8. With 2 recalls (Attempts I+II+III), the minimum r-m-s error
drops to 11.3%. No sample however big can penetrate below this
minimum with only two recalls.

9. With 3 recalls (Attempts I-IV), the minimum r-m-s error drops
to 8.7%. With 4, 5, and 6 recalls, the minimum r-m-s error drops to
6.9, 5.6, and 4.7%.

10. To attain a prescribed r-m-s error of (e.g.) 10%:

(a) We may use 3, 4, 5, or 6 recalls with initial samples as shown in the
accompanying table.

    From Fig. 3
    No. of recalls    Initial sample    Cost
    6                 210               $  875
    5                 245                1,010
    4                 325                1,380
    3                 730                2,910

(b) With 0, 1, or 2 recalls we can not achieve the prescribed r-m-s error (10%)
with any sample however big.

TABLE 6

NUMERICAL VALUES OF THE BIASES AND R-M-S ERRORS FOR
VARIOUS SIZES OF INITIAL SAMPLE (n); 2d SET OF a_i,
y = .25. COSTS AT n = 1000

              Plan        I        I-II     I-III    I-IV     I-V      I-VI     I-VII
           Rel bias      .245190  .155062  .112605  .086955  .069408  .056593  .046815
n = 100    Rel m-s-e     .091985  .046455  .032012  .025385  .021763  .019563  .018135
           Rel r-m-s-e   .303290  .215534  .178919  .159327  .147523  .139868  .134666
n = 200    Rel m-s-e     .076052  .035250  .022353  .016473  .013290  .011383  .010164
           Rel r-m-s-e   .275775  .187750  .149509  .128347  .115282  .106691  .100817
n = 300    Rel m-s-e     .070741  .031515  .019133  .013502  .010466  .008656  .007507
           Rel r-m-s-e   .265972  .177525  .138322  .116198  .102303  .093038  .086643
n = 500    Rel m-s-e     .066492  .028527  .016558  .011125  .008207  .006475  .005381
           Rel r-m-s-e   .257860  .168899  .128678  .105475  .090592  .080467  .073355
n = 1000   Rel m-s-e     .063308  .026286  .014626  .009343  .006512  .004839  .003787
           Rel r-m-s-e   .251607  .162130  .120938  .096659  .080697  .069563  .061539
n = 2000   Rel m-s-e     .061712  .025166  .013660  .008451  .005665  .004021  .002990
           Rel r-m-s-e   .248419  .158638  .116876  .091929  .075266  .063411  .054681
n = 3000   Rel m-s-e     .061181  .024792  .013338  .008154  .005383  .003748  .002724
           Rel r-m-s-e   .247348  .157455  .115490  .090299  .073369  .061221  .052192
n = 5000   Rel m-s-e     .060756  .024493  .013080  .007917  .005157  .003530  .002512
           Rel r-m-s-e   .246487  .156502  .114368  .088978  .071812  .059414  .050120
Costs at n = 1000        $3000    3469     3779     4014     4207     4371     4516

Figure 3. The relative bias, the relative r-m-s error, and the cost, plotted
against the initial sample-size ($n$) for various plans, for the 2d set of $a_i$, in which
$a_1 = .1a_8$. The curves show the futility of attempting to achieve accuracy by
sheer size of sample. Recalls are much more effective. The dashed lines show the
size of sample required, and the cost, to yield a relative r-m-s error of 10%.
The relative biases and the relative r-m-s errors are in units of $a^*$.


C. General conclusions

11. Even with three recalls, with the level of response assumed in the
calculations (taken from average urban experience), a sample bigger
than the binomial equivalent of from 300 to 500 for an estimate of any
one class is ineffective and uneconomical. A plan that would reap any
real benefit from bigger samples must support 4 or 5 or more recalls.

12. An attempted “complete count” is no exception, and often
represents an extreme waste of effort.

13. With the proportions of nonresponse assumed here, high ac- 
curacy can be attained only with 4, 5, or 6 recalls, along with an initial 
sample equivalent to from 800 to 1500 binomial cases. Careful con- 
sideration should therefore be given in the planning to decide whether 
the need for extreme accuracy warrants the required expense and delay 
occasioned by recalls beyond the 3d, and for an initial sample bigger 
than the binomial equivalent of n=300 in any subclass of the universe 
for which an estimate is desired. 

14. Table 8 shows that where extremely high accuracy is required,
the Politz plan with 2000 or more binomial cases becomes competitive
in cost with a survey that depends on recalls. In any case, the Politz
plan has the advantage of speed, and of being able to produce results
under circumstances wherein recalls are impossible.

15. Because one kind of experience may be translated into another
by transformations similar to Equation 39, the generality of the above
conclusions and their impact on the design and interpretation of
surveys and of complete counts are inescapable. A limiting case of
exception occurs, of course, when the range of variation of the $a_i$ is small
compared with $a^*$.

16. The above conclusions with respect to the number of recalls re- 
quired are generally applicable to all types of sample-design for draw- 
ing the sampling units. A change in sample-design (as from the bi- 
nomial sampling of individuals to samples of areas) only changes (usu- 
ally widens) the distance from the bias to the r-m-s error in Figs. 2 and 
3, without raising or lowering the bias. The most economical number (n) 
of interviews in an area sample, for any given number of recalls, will 
for most characteristics be bigger than the figures mentioned in con- 
clusions 11, 13, and 14. The increase may range from 0 on up to some- 
times double, depending on the characteristic and the clustering effect 
of the interviewers’ workloads. 


IMPACT ON DESIGN 


The most impressive feature of the results is the heavy bias of
nonresponse, when no provision is made to reduce it, even though there
be but a 2-fold variation from $a_1$ to $a_8$.

The second most impressive feature is the fact that if nonresponse
reaches anywhere near the proportions ($p_i$) assumed, then when the 0
of the scale of the $a_i$ is not large, we can not afford, except for special
justification, to plan for extreme accuracy: it is simply too expensive.

This conclusion is also borne out by Table 7, which shows that more
information per dollar comes from a sample of 500 than from a sample
of 1000; and that every successive recall shows a gain in the amount of
information obtained per dollar, particularly for the smaller sample
size. An optimum is not reached even with six recalls. In other words,
as we concluded earlier from Figs. 2 and 3 and from Tables 5 and 6, we
get more for our money by taking a moderate initial sample and
digging deep into it with many recalls. However, many recalls delay the
day on which the tabulations will be ready, and one may be forced to
call a halt at 3 or 4 recalls. Where speed is urgent, or where recalls are
otherwise inadvisable, one may bear in mind the Politz plan, which
offers a rapid solution with recalls only on the temporary refusals.

TABLE 7

THE AMOUNT OF INFORMATION PER UNIT COST FOR THE SEVEN
PLANS (FROM 0 TO 6 RECALLS), FOR INITIAL SAMPLES OF 500 AND
1000. INFORMATION IS DEFINED AS THE RECIPROCAL OF THE
REL M-S-E IN TABLES 5 AND 6. THE COST COMES FROM TABLE 3

              1st set of a_i              2d set of a_i
Plan       n = 500     n = 1000       n = 500     n = 1000
I          .044 358    .024 400       .005 013    .005 265
I-II       .056 562    .033 877       .020 216    .010 966
I-III      .066 744    .043 522       .031 954    .018 092
I-IV       .074 326    .052 524       .044 787    .026 664
I-V        .079 422    .060 315       .057 912    .036 501
I-VI       .082 438    .066 519       .070 649    .047 278
I-VII      .083 856    .071 127       .082 302    .058 472
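
The measure used in Table 7 is simple to reproduce from Tables 3, 5 and 6: information is the reciprocal of the relative mean square error, divided by the cost of interviewing. The sketch below shows the arithmetic for two entries; the function name is hypothetical.

```python
def information_per_dollar(rel_mse, cost):
    """Reciprocal of the relative m-s-e, per dollar of interviewing cost."""
    return 1.0 / (rel_mse * cost)


if __name__ == "__main__":
    # 1st set of a_i, n = 1000 (relative m-s-e and cost as shown in Table 5).
    print(round(information_per_dollar(0.013661, 3000), 6))   # Plan I
    print(round(information_per_dollar(0.007156, 4125), 6))   # Plan I+II
```

These two values agree with the entries .024 400 and .033 877 in Table 7.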
With the usual method of estimation (pooling the initial call and the
recalls) the best way to attain accuracy is to build up the initial
response (i.e., to increase $p_8$). One or two recalls would then be much
more effective than they are under the conditions assumed; and bigger
samples would also be more effective. Observations on the proper time
of day to find certain kinds of people at home in a particular area, and
willing to answer questions, plus a skillful introduction and approach
so as to cut down refusals, are known to be helpful in this direction.

An attempted complete count is no exception to the conclusions
reached. Without a highly successful initial response, followed by some
effective number of recalls, 95% of the energy put into a complete
count, taken to obtain an estimate for a large area, may be wasted.
Size does not atone for nonresponse: this is all too evident from the
calculations (Tables 5 and 6; Figs. 2 and 3).

The mechanism adopted here is a device by which experience can
be accumulated and pointed toward the attainment of (a) greater
accuracy per unit cost, and (b) less waste, through conservation of
unproductive effort expended on samples that are too big. Good guesses
for the constants $p_i$, $a_i$, $\sigma_i$ can almost always be made on the basis of
past experience; and the calculations made with them will indicate a
plan not far from optimum. Continued experience will provide
improved numerical values for the constants, and continually improved
design and interpretation of the results. Without a probability design
of some sort, it is difficult to capitalize on experience.

Although the discourse here has been entirely in terms of interviews, 
the results are equally applicable to surveys in which the initial at- 
tempt is made by mail, or in which all attempts are made by mail. 
Appropriate changes must of course be made in the numerical values of 
the constants. Thus, if the mail were used for Attempt I, and if
interviews were used for the recalls, then the cost D in Equation 38 would
be much less than it is when interviews are used in Attempt I, and y
would then be smaller. For example, if the cost of a mailed questionnaire
were $0.75, and if the cost of an interview on a nonresponse were $5,
then y would reduce to perhaps as low a figure as 1 in 6, depending of
course on the other constants in the equation.

One may well wonder what the biases are in surveys that depend
only on a mailed survey with a 15% total response, or even 30% or
50%, without calls on the nonresponses. The mechanism adopted here
shows that it is a mystery how such results can be worth anything at
all.


IMPACT ON METHODS OF ESTIMATION 


After the returns from the survey are in, there remains the problem
of estimating the mean per sampling unit, and the standard error of
this estimate. As the survey does not touch Class 0, it can by itself
only produce estimates for Classes 1-6.

The usual practice of combining the various attempts (after weighting
Attempt II and higher attempts by the factor 1/y) may be both
misleading and inefficient. A glance at Table 5 or at Figure 2 shows that
41% of the bias still remains after the 3d recall, and that 27% still
remains after the 5th. Table 6 and Figure 3 are equally discouraging.
The decreasingly slow ascent toward the vertex of 0 bias may explain
how easy it is to conclude, incorrectly, that after 3 recalls there is little
more bias to squeeze out, and that additional recalls are not worth their
cost.

To illustrate the usual procedure, let us make some calculations on
a yes and no survey.¹⁰ The proportions of yes in the various classes will
range, let us suppose, from 60% in Class 1 down to 40% in Class 8,
following the relative values of 20(a_i + 1) derived from the 1st set of a_i
in Table 1. Table 9, calculated with the aid of Equation 41, shows the
expected results of combining 2 attempts, 3 attempts, etc. The result
that we really need is the patient mean, shown at the bottom of the
table as the expected result of continuing the recalls indefinitely. The
slow progress of the combined result is obvious; also the need of
something better.


TABLE 9

THE EXPECTED PROPORTIONS OF YES, FOR SEVERAL PLANS,
COMPUTED BY EQUATION 41. THE PROPORTIONS OF YES
RANGE FROM 60% IN CLASS 1 TO 40% IN CLASS 8

                    Expected proportion        Bias
Plan                Yes          No            remaining

Attempt I           44.10        55.90         100.0%
I-II                45.05        54.95          68.4
I-III               45.55        54.45          51.8
I-IV                45.88        54.12          40.9
I-V                 46.11        53.89          33.2
I-VI                46.28        53.72          27.6
I-VII               46.42        53.58          22.8

Infinity (a*)       47.11        52.89           0
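The "bias remaining" column of Table 9 can be checked with a few lines of Python. The sketch below assumes (this is inferred from the tabulated figures, not stated in the text) that the bias remaining is the gap between the pooled result and the patient mean, expressed as a percentage of the gap left by Attempt I alone. The function and variable names are illustrative only.

```python
# Expected proportions of "yes" (per cent), copied from Table 9.
pooled_yes = {
    "I-II": 45.05,
    "I-III": 45.55,
    "I-IV": 45.88,
    "I-V": 46.11,
    "I-VI": 46.28,
}
ATTEMPT_I = 44.10      # result of the initial call alone
PATIENT_MEAN = 47.11   # limit of continuing the recalls indefinitely


def bias_remaining(pooled, first=ATTEMPT_I, limit=PATIENT_MEAN):
    """Percentage of the initial bias still present in a pooled result.

    Assumed definition: (limit - pooled) / (limit - first) * 100.
    """
    return 100.0 * (limit - pooled) / (limit - first)


for plan, yes in pooled_yes.items():
    print(f"{plan:6s} {yes:6.2f}  {bias_remaining(yes):5.1f}% of the bias remains")
```

Under this assumed definition the computed percentages agree with the printed column to one decimal place, which illustrates how slowly pooling removes the bias.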


What we need is a way to extract more information from the recalls.
A more efficient estimate may be contained in a scheme for extrapolating
the results of the various attempts, as proposed by Hendricks.¹¹
The mechanism proposed here will provide a rational scale for the
extrapolation. It may be that the scale proposed by Hendricks is
appropriate, or it may be that some other scale will give more accurate
results with convenience.

For an estimate of a* by extrapolation, we may look upon recalls as
necessary to provide the required coordinates of points by which to
make the extrapolation, and not merely to provide additional returns
to add to the initial attempt.

¹⁰ I am indebted to Dr. Leo P. Crespi and to Mr. Fred W. Trembour of the Reactions Analysis
Staff in the Office of the High Commissioner for Germany, who in several conversations with the author
brought up questions and suggestions that led to this illustration.

¹¹ Walter A. Hendricks, Chapter 5 in the book Agricultural Estimating and Reporting Service
(Miscellaneous Publication No. 703, Bureau of Agricultural Economics, Washington, 1949); pages 31-95.

For this new type of estimate, the standard error would not be
calculated in the usual way (Equation 24), but as the standard error
of the intercept on the scale along which we read, by extrapolation, the
estimate of a*. New theory will be required for the optimum allocation
of effort amongst the various recalls, and for effecting the
extrapolation; also for calculating its standard error. It may turn out, for
example, that unless one can achieve extremely high initial response,
approaching 90%, there may be little point in expending funds to build
it up. It is possible that theory beyond the scope of this paper may lead
to efficiency and reliability far beyond those attained in practice today.


SOME REMARKS ON CLASS 0 


We must face the fact that our survey can at best only provide
estimates for Classes 1-6, although it can also give us the proportion p_0
and some of the characteristics of Class 0. The administrative decisions
that the survey was expected to help may nevertheless involve Class 0
along with the others. In a marketing study, for example, the people in
this class may be heavy purchasers of the very commodity that forms
the subject of the survey. They may in part be people who travel much,
and who may thus be important to a railway, an air line, a manufacturer
of automobiles, a hotel, and to others. They may be people in
high income groups. It may therefore be important to learn how much
we are missing by not bringing them into the survey.

Unfortunately, it is impossible to learn this magnitude from the
survey itself. The only possible approach seems to be from outside sources,
such as through statistics on the total movement of a particular product
from wholesale into retail stores. It is possible in many cases to find
outside evidence by which to evaluate approximately the magnitude
of a_0 (the mean in Class 0), or rather of the total a_0 p_0 in Class 0, for
some of the important characteristics that affect the a_i. One
suggestion is to ascribe upper and lower bounds to the possible
magnitude of a_0 p_0, and thus to infer the possible effects of Class 0
on the uses and limitations of the data.¹²

The difficulty with Class 0 is not peculiarly a sampling problem, as
Class 0 appears in complete counts as well as in samples; in fact, it is
undoubtedly bigger in complete counts than in samples.

¹² This suggestion came from Professor Philip M. Hauser in an informal conversation in regard to
this research.


EFFECT OF WEIGHTING BY CARD-DUPLICATION
ON THE EFFICIENCY OF SURVEY RESULTS


IRVING ROSHWALB
Opinion Research Corporation


SURVEY designs often specify that weights be applied to certain
groups of observations as part of the estimation procedures. These
estimation procedures are designed as complements of sampling
instructions which may, for example, specify methods for eliminating
call-backs,¹ or the disproportionate allocation of the sample to the
several strata.² A third example of the need for weighting may be taken
from the fact that the denominator of the sampling fraction, 1/k, is
frequently a non-integral divisor of the size of the population to which the
fraction is applied. This means that the number of cases obtained is
very often not equal to the number of cases desired, and that some
weighting adjustments may thus be called for. The subject of this note
is the specific problem of the effect of non-integral weights (e.g., 1.2,
1.75, 8.4, etc.) on the sampling variability of the survey results. The
two weighting procedures described below, arithmetic and
card-duplication, give identical results when the weights are integral. This identity
disappears when the weights are non-integral, for card-duplication
involves sampling the returned questionnaires for reproduction. A
discussion of the problem for the case of stratified sampling, using
disproportionate allocation, offers a useful solution.

When a sample design calls for the disproportionate allocation of the
sample to the several strata, we have seen that there are two alternative
estimation procedures:

(a) Arithmetic weighting: If $W_i$ is the proportion of the population in the
ith stratum and $\bar{x}_i$ is the sample estimate of the mean in the ith stratum,
then $\bar{x} = \sum_i W_i \bar{x}_i$ is an unbiased estimate of the population mean, $\mu$
(i = 1, 2, ...).

(b) Card-duplication weighting: If the results of the survey can be recorded
on punch-cards, then it is possible to weight the $N_i$ cards in the ith
stratum to their proper weight in the sample by drawing a random sample
of $n_i$ cards from the original $N_i$ cards so that the total number of cards in
the stratum, $(N_i + n_i)$, is equal to $W_i(N+n)$, where $(N+n)$ is the total
number of cards in all strata, including both the original and the duplicate
cards.

In the case of arithmetic weighting, the variance of $\bar{x}$ is known to be

$$\operatorname{Var}\bar{x} = \sum_i W_i^2 \operatorname{Var}\bar{x}_i. \qquad (1)$$

¹ A. Politz and W. Simmons, "An attempt to get the not-at-homes into the sample without
call-backs," Journal of the American Statistical Association, 44 (1949), 9-31.
² W. E. Deming, Some Theory of Sampling. New York: John Wiley and Sons, 1950, p. 215.


In the case of card-duplication, the variance of the sample estimate
of the mean may contain an additional component of variation due to
the sample of $n_i$ cards drawn from the original sample $N_i$ in the ith
stratum. Thus, if $n_i = 0$, or $n_i = dN_i$, where d is an integral number, then
this additional component is equal to zero. If, however, $n_i/N_i > 0$ and
non-integral, this additional component may be greater than zero.
Disregarding the stratal index, we may then seek the value of n (the
number of cards to duplicate) which maximizes the sampling error of an
estimate based on the N+n cards, original and duplicates, and,
coincidentally, examine the effect of card-duplication on the variance of the
estimate.

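Assume that a random sample of N items be drawn without
replacement from a population of size P and with mean $\mu$ and variance $\sigma^2$.
Draw n cards at random without replacement from the N, duplicate
these n cards, and estimate $\mu$ on the basis of the N+n cards. This last
estimate is

$$m = \frac{1}{N+n}\left\{2\sum_{i=1}^{n} x_i + \sum_{i=n+1}^{N} x_i\right\}, \qquad (2)$$

where the first n of the N cards are taken to be those duplicated, and

$$\operatorname{Var} m = \frac{\sigma^2}{(N+n)^2}\left\{4n\,\frac{P-n}{P-1} + (N-n)\,\frac{P-(N-n)}{P-1}\right\}. \qquad (3)$$

Considered as a function of n,

$$\operatorname{Var} m = \frac{\sigma^2}{N}\cdot\frac{P-N}{P-1}$$

for n = 0 and n = N. Var m attains its maximum value when n = N/3.
When n = N/3,

$$\operatorname{Var} m = \frac{\sigma^2}{N}\left\{\frac{9P-4N}{8(P-1)}\right\}.$$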
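A short numerical check of equation (3) may help. The sketch below is illustrative only: it codes the variance in the form of equation (3) as read from this text, compares it with the variance of the unweighted sample mean, and searches for the number of duplicated cards that gives the largest variance. The function names and the trial values of N and P are assumptions made for the example.

```python
def var_unweighted_mean(N, P, sigma2=1.0):
    """Variance of the mean of N items drawn without replacement from P."""
    return sigma2 / N * (P - N) / (P - 1)


def var_duplicated_mean(n, N, P, sigma2=1.0):
    """Variance of m after duplicating n of the N cards, as in equation (3)."""
    doubled = 4 * n * (P - n) / (P - 1)            # the n cards counted twice
    single = (N - n) * (P - (N - n)) / (P - 1)     # the N - n cards counted once
    return sigma2 / (N + n) ** 2 * (doubled + single)


def relative_information(d, r):
    """Var(unweighted mean) / Var(m) for duplication rate d = n/N and
    sampling rate r = N/P, obtained by dividing the two expressions above."""
    return (1 - r) * (1 + d) ** 2 / ((3 * d + 1) - (5 * d * d - 2 * d + 1) * r)


N, P = 300, 600  # hypothetical sample and population sizes (r = 0.5)
worst_n = max(range(N + 1), key=lambda n: var_duplicated_mean(n, N, P))
print("n giving the largest variance:", worst_n)
print("relative information there:",
      round(100 * var_unweighted_mean(N, P) / var_duplicated_mean(worst_n, N, P), 2))
```

With these trial values the search lands on n = N/3, and the ratio of the two variances there is 4/7, or about 57 per cent.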


In order to study the effect of weighting by card-duplication on the
variance of the sample mean, we might first compute the relative
information³ of this estimate, where the relative information, computed as
the ratio between Var $\bar{x}$ and Var m, is

$$I = \frac{(1-r)(1+d)^2}{(3d+1) - (5d^2 - 2d + 1)\,r}, \qquad (4)$$

where d is the rate of duplication, n/N, and r is the sampling rate, N/P.


EXAMPLES OF THE EFFECT OF CARD-DUPLICATION ON
THE VARIANCE OF THE SAMPLE MEAN

                     Rate of Duplication
Sampling     20%        33⅓%       50%        66⅔%
Rate         Relative Information* (percent)

0†           90.00      88.89      90.00      92.59
.01          89.55      88.39      89.55      92.25
.05          87.69      86.36      87.69      90.82
.10          85.26      83.72      85.26      88.93
.50          60.00      57.14      60.00      67.57
.90          16.36      14.81      16.36      21.37

* Relative Information = (Var $\bar{x}$ / Var m) × 100. The Relative Loss of Information due to
card-duplication may be computed by subtracting the appropriate I value from 100%, i.e., the
Relative Loss of Information = 100% − (Var $\bar{x}$ / Var m) × 100.
† A zero sampling rate corresponds to the case of sampling from an infinite population, or sampling
with replacement.

When r = 0, the condition of sampling with replacement holds, and
$I = (1+d)^2/(3d+1)$, a function of the rate of duplication only. The
table exhibits a few examples of the effect of card-duplication on the
variance of the sample mean, m, for several sampling rates. The figure,
exhibiting the graph of I as a function of d for various values of r,
demonstrates the slight losses in efficiency due to card-duplication when the
sampling rate is small. It also points up the shallowness of these curves,
i.e., the insensitiveness of I to changes in n within a broad interval
about the critical value, n = N/3.

[Figure: Effect of Weighting by Card-Duplication: Relative Information of the
Weighted Sample. Relative information plotted against rate of duplication, one
curve for each sampling rate.]

When card-duplication is applied in the case of a stratified sample,

$$m = \sum_{i=1}^{L} \frac{N_i + n_i}{N+n}\, m_i \qquad (5)$$

and

$$\operatorname{Var} m = \sum_{i} W_i^2\,\frac{\sigma_i^2}{(N_i+n_i)^2}\left\{4n_i\,\frac{P_i - n_i}{P_i - 1} + (N_i - n_i)\,\frac{P_i - (N_i - n_i)}{P_i - 1}\right\}, \qquad (6)$$

where $m_i$ is the sample estimate of the mean in the ith stratum, $N_i$ and
$n_i$ are as before, $N = \sum N_i$, $n = \sum n_i$, and $W_i = (N_i + n_i)/(N+n)$.
Comparing expressions (1) and (6), we can see that Var m − Var $\bar{x}$ is 0 for
$n_i = 0$ or $N_i$. The least efficient case is the limiting one for which
$n_i = N_i/3$. In this case,

³ R. A. Fisher, Statistical Methods for Research Workers, Tenth Edition, Edinburgh: Oliver and Boyd,
1948, Section 55.

$$\operatorname{Var} m = 1.125\,\operatorname{Var}\bar{x} + 0.625\sum_{i} \frac{W_i^2\,\sigma_i^2}{P_i - 1}.$$

The increase in sampling variance for the stratified sample due to
card-duplication is a resultant of the strata increases. Let the symbol
for this increase be C².

In any practical operation it might be asked whether C² is larger or
smaller than the (bias)² due to incorrect strata weights that the
weighting procedure is designed to remove.
If $w_i$ are the incorrect strata weights and $W_i$ the correct ones, the
bias due to the use of incorrect weights may be expressed as

$$B_w^2 = \Big\{\sum_i (w_i - W_i)\,\mu_i\Big\}^2, \qquad (7)$$

where $\mu_i$ is the mean of the ith stratum.

It would seem then that weighting by card-duplication is useful only
when $B_w^2 > C^2$. On the other hand, nothing is to be gained and
something may be lost if $B_w^2 \le C^2$.

This indicates that if the method of card-duplication is used to
obtain estimates from a disproportionate sample, then whatever gains
might have been expected from the disproportionality could be
seriously reduced by this weighting procedure. In other words, the 
gains due to the sample design must be at least large enough to over- 
come the loss of efficiency due to the weighting scheme. 


THE MATHEMATICAL BASIS FOR THE BEAN METHOD OF 
GRAPHIC MULTIPLE CORRELATION* 


RICHARD J. FOOTE
Bureau of Agricultural Economics


IN 1929, Louis H. Bean published an article describing a graphic
method of multiple correlation which subsequently has been widely
used, particularly in the field of agricultural economics.¹ In the late
1930's, considerable controversy arose among users of the method.
This controversy in part concerned the correct interpretation of the
results obtained in terms of standard mathematical coefficients. To
clarify these aspects, the writer, with J. Russell Ives, published in 1941
a paper outlining in detail the relationship of the graphic method to
the mathematical method of least squares.² The mathematical proofs
in that paper were developed with the assistance of M. A. Girshick,
then on the staff of the Bureau of Agricultural Economics. The
material presented here is based mainly on that given in the 1941 paper,
but includes certain closely related aspects not published at that time.
Attention is concentrated on the relationship of the graphic method to
the mathematical method of least squares. Adequate descriptions of
the mechanics of the graphic method are available in a number of
publications.³

Relationships between the graphic method and the least squares
method can be explained most effectively in terms of the simpler cases,
especially three-variable linear multiple regression. In such cases, if $X_1$
is the dependent variable and $X_2$ and $X_3$ are independent variables,
when $r_{23}$ is equal to zero, each partial regression coefficient is identical
with the corresponding simple regression coefficient. When $r_{23}$ is
considerably different from zero, the device of "drift lines" (to be discussed
later) facilitates estimation of first approximations to $b_{12.3}$ and $b_{13.2}$ which
are superior to $b_{12}$ and $b_{13}$. Successive transference of residuals leads to
lines with slopes approximately equal to the mathematically-calculated
values of $b_{12.3}$ and $b_{13.2}$, but the process is slow when $r_{23}$ is considerably
different from zero. One problem which tends to slow up the speed of
convergence is the inability of the research worker to draw least
squares regression lines accurately. There is a common tendency to
draw them too steep. Another problem is due to the fact that the
iterative process converges more rapidly if measured by the mean square
residual than if measured in terms of a particular partial regression
coefficient. Thus, in difficult cases, the first several rounds of the iterative
process may yield a good approximation to $R_{1.23}$ but poor
approximations to the regression coefficients $b_{12.3}$ and $b_{13.2}$. These points are
discussed in more detail in the remainder of this paper.

* This material was prepared for presentation at the 1952 meetings of the American Statistical
Association. Due to certain unavoidable complications, the session on graphic correlation was not
held. Since the paper by Foote and Ives referred to in footnote 2 was issued only in mimeographed form
and is available only in libraries, it appeared worth while to publish this as a journal paper.
¹ Bean, Louis H., "A simplified method of graphic curvilinear correlation," Journal of the American
Statistical Association, 24 (1929), 386-97. A mimeographed publication containing essentially the same
material was issued by the Bureau of Agricultural Economics.
² "The relationship of the method of graphic correlation to least squares," Statistics and Agriculture,
Bureau of Agricultural Economics, 1941. (Processed.)
³ See, for example, Bean, op. cit., or Thomsen, Frederick L., and Foote, Richard J., Agricultural
Prices, McGraw-Hill Book Co., New York, 1952, pp. 298-310.


MATHEMATICAL MEANING OF THE DRIFT LINES USED IN 
THE GRAPHIC ANALYSIS 
If, in the equation

$$X_1 = b_1 + b_2X_2 + b_3X_3 + \cdots + b_pX_p \qquad (1)$$

constant values are assigned to $X_3, \cdots, X_p$, then $b_3X_3 + \cdots + b_pX_p$
is equal to some constant that can be combined with the constant $b_1$
to give a new constant K. Equation (1) can then be written as

$$X_1 = K + b_2X_2 \qquad (2)$$

which is the equation of a straight line having a slope equal to $b_2$. Here
$b_2$, which may be written as $b_{12.3\cdots p}$, is the regression of $X_1$ on $X_2$ when
$X_3, \cdots, X_p$ are constant.

If two or more observations in a scatter diagram of $X_1$ on $X_2$ had the
same or approximately the same value of $X_3, \cdots, X_p$, then an
estimate of $b_{12.3\cdots p}$ could be obtained by drawing a best fitting line through
them. If this process were repeated for several groups of points having
the same $X_3, \cdots, X_p$ values, several lines whose slopes are estimates
of the same partial regression coefficient, $b_{12.3\cdots p}$, would be obtained.
The process of obtaining estimates of $b_{12.3\cdots p}$ from the slope of these
lines is equivalent to breaking the total sample into selected
subsamples and obtaining from each of these an independent estimate of
$b_{12.3\cdots p}$. Such lines are the "drift lines" used in the graphic method.

The closeness with which the average of these slopes approximates
the mathematical partial regression line will depend upon the stability
of the slopes of the individual drift lines. In general, the amount of
fluctuation that may be expected in the slopes of the drift lines will
depend on (1) the number of observations and the extent of variation
in the $X_2$ values on which each is based and (2) the size of the partial
correlation between $X_1$ and $X_2$ when $X_3$ is constant.


PROOF THAT THE SIMPLE REGRESSION IN THE FINAL CHART WILL
EQUAL THE PARTIAL REGRESSION PROVIDED THE PARTIAL
REGRESSION IN THE FIRST CHART HAS BEEN
ESTIMATED CORRECTLY (THREE-VARIABLE PROBLEM)


As it is not obvious that the simple regression in the second or final
chart of a three-variable problem will equal the partial regression even
if the drift lines correctly estimate the partial regression in the first
chart, a mathematical proof is given. Stated mathematically, the
problem is as follows: if the deviations from the regression in the first chart
are considered as a new variable $V_1$, so that

$$V_1 = X_1 - b_{12.3}X_2,$$

and if the first approximation to $b_{12.3}$ is equal to its mathematically
calculated value for the sample of data, we wish to show that the simple
regression between $V_1$ and $X_3$ is equal to $b_{13.2}$.

The following symbols, which may be new to some readers, are used.

$$a_{12} = \frac{S(X_1 - \bar{X}_1)(X_2 - \bar{X}_2)}{N-1}, \quad a_{13} = \frac{S(X_1 - \bar{X}_1)(X_3 - \bar{X}_3)}{N-1}, \text{ etc.}$$

$$a_{11} = \frac{S(X_1 - \bar{X}_1)^2}{N-1}, \quad a_{22} = \frac{S(X_2 - \bar{X}_2)^2}{N-1}, \text{ etc.}$$

where $\bar{X}_1$ is the mean of $X_1$, etc., N is the number of observations in the
sample, and S is the sum for all observations in the sample.

The following transformations can be made.

$$a_{12} = s_1s_2r_{12}, \quad a_{13} = s_1s_3r_{13}, \text{ etc.}$$
$$a_{11} = s_1^2, \quad a_{22} = s_2^2, \text{ etc.}$$

where $s_1$ is the standard deviation of $X_1$, etc.

It can be shown that in terms of the standard deviations and
correlations

$$b_{12.3} = \frac{s_1}{s_2}\cdot\frac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2} \qquad (3)$$

and

$$b_{13.2} = \frac{s_1}{s_3}\cdot\frac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}. \qquad (4)$$

It is desired to determine the simple regression coefficient of $V_1$ on $X_3$
when

$$V_1 = (X_1 - \bar{X}_1) - b_{12.3}(X_2 - \bar{X}_2).^4$$

Now

$$b_{V_1X_3} = \frac{S\,V_1(X_3 - \bar{X}_3)}{S(X_3 - \bar{X}_3)^2} = \frac{S[(X_1 - \bar{X}_1) - b_{12.3}(X_2 - \bar{X}_2)](X_3 - \bar{X}_3)}{S(X_3 - \bar{X}_3)^2} = \frac{a_{13} - b_{12.3}\,a_{23}}{a_{33}}.$$

Substituting the value of $b_{12.3}$ from equation (3) and simplifying,

$$b_{V_1X_3} = \frac{s_1}{s_3}\cdot\frac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}.$$

By equation (4),

$$b_{V_1X_3} = b_{13.2}.$$

Except that more algebraic manipulations are required, it is equally
easy to show that the simple regression on the final chart of a problem
involving more variables will equal the partial regression $b_{1n.23\cdots(n-1)}$,
provided that all of the other partial regressions have been estimated
correctly by the use of drift lines.
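The three-variable result can also be checked numerically. The following sketch is not part of the original derivation; it uses a small set of made-up observations (hypothetical values chosen only for illustration), computes $b_{12.3}$ and $b_{13.2}$ from the two normal equations, forms $V_1 = X_1 - b_{12.3}X_2$, and confirms that the simple regression of $V_1$ on $X_3$ reproduces $b_{13.2}$.

```python
def cov(u, v):
    """Sample covariance a_uv = S(u - mean_u)(v - mean_v) / (N - 1)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def partial_regressions(x1, x2, x3):
    """Solve the two normal equations for b12.3 and b13.2."""
    a22, a33, a23 = cov(x2, x2), cov(x3, x3), cov(x2, x3)
    a12, a13 = cov(x1, x2), cov(x1, x3)
    det = a22 * a33 - a23 ** 2
    return (a12 * a33 - a13 * a23) / det, (a13 * a22 - a12 * a23) / det


# Hypothetical observations, for illustration only.
x2 = [1, 2, 3, 4, 5, 6, 7, 8]
x3 = [2, 7, 1, 8, 3, 6, 4, 5]
x1 = [6.1, 4.2, 10.9, 5.8, 13.4, 8.1, 17.2, 14.6]

b12_3, b13_2 = partial_regressions(x1, x2, x3)

# Deviations from the (correct) first-chart regression, treated as a new variable.
v1 = [y - b12_3 * x for y, x in zip(x1, x2)]
slope_v1_on_x3 = cov(v1, x3) / cov(x3, x3)

print("b13.2 from the normal equations:", round(b13_2, 6))
print("simple regression of V1 on X3:  ", round(slope_v1_on_x3, 6))
```

The two printed slopes agree to rounding error, as the proof requires; they would not agree if an incorrect value were used for $b_{12.3}$ and $r_{23}$ were different from zero.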


MATHEMATICAL EQUIVALENT OF THE PROCESS OF 
SUCCESSIVE APPROXIMATION 

In the following paragraphs, a mathematical iterative or successive 
approximation method for obtaining the least squares regression co- 
efficients is briefly outlined. The notation applies to а four-variable 
problem. 

In the method of least squares, the coefficients $b_{12.34}$, $b_{13.24}$, and $b_{14.23}$
are determined by minimizing the quantity

$$f(b_{12.34}, b_{13.24}, b_{14.23}) = S[(X_1 - \bar{X}_1) - b_{12.34}(X_2 - \bar{X}_2) - b_{13.24}(X_3 - \bar{X}_3) - b_{14.23}(X_4 - \bar{X}_4)]^2.$$

The solution yields the following three normal equations:

$$b_{12.34}\,a_{22} + b_{13.24}\,a_{23} + b_{14.23}\,a_{24} = a_{12} \qquad (5)$$
$$b_{12.34}\,a_{23} + b_{13.24}\,a_{33} + b_{14.23}\,a_{34} = a_{13} \qquad (6)$$
$$b_{12.34}\,a_{24} + b_{13.24}\,a_{34} + b_{14.23}\,a_{44} = a_{14}. \qquad (7)$$

⁴ For purposes of derivation, it is convenient to express $X_1$ and $X_2$ in this equation in terms of
deviations from their respective means. Actual values, however, are used in the mechanics of the graphic
method. Since coding by subtraction does not affect the value of a regression coefficient, the proof
applies in either case.


782 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1953


These equations can be solved by well-known methods.

In the iterative process, values $b^{(1)}_{12.34}$ and $b^{(1)}_{13.24}$ are guessed for
$b_{12.34}$ and $b_{13.24}$ respectively in equation (7) and a solution for $b_{14.23}$ (say
$b^{(1)}_{14.23}$) is obtained from this equation. Then $b^{(1)}_{13.24}$ and $b^{(1)}_{14.23}$ are
substituted in equation (5) for $b_{13.24}$ and $b_{14.23}$ respectively and a second
approximation for $b_{12.34}$ (say $b^{(2)}_{12.34}$) is obtained. The values $b^{(2)}_{12.34}$ and
$b^{(1)}_{14.23}$ are substituted in equation (6) for $b_{12.34}$ and $b_{14.23}$ respectively
and a second approximation to $b_{13.24}$ (say $b^{(2)}_{13.24}$) is obtained. The
values $b^{(2)}_{12.34}$ and $b^{(2)}_{13.24}$ are substituted in equation (7) for $b_{12.34}$ and $b_{13.24}$
respectively and a second approximation to $b_{14.23}$ (say $b^{(2)}_{14.23}$) is
obtained. This process is repeated until the coefficients converge to stable
values.

The iterative process outlined above is equivalent to the following
steps: (1) Assign values $b^{(1)}_{12.34}$ and $b^{(1)}_{13.24}$ to $b_{12.34}$ and $b_{13.24}$ respectively
in the function $f(b_{12.34}, b_{13.24}, b_{14.23})$ defined above. Find that value of
$b_{14.23}$ which makes $f(b^{(1)}_{12.34}, b^{(1)}_{13.24}, b_{14.23})$ a minimum. Let it be $b^{(1)}_{14.23}$.
(2) Find that value of $b_{12.34}$ which makes $f(b_{12.34}, b^{(1)}_{13.24}, b^{(1)}_{14.23})$ a
minimum. Let this value be $b^{(2)}_{12.34}$. (3) Find that value of $b_{13.24}$ which makes
$f(b^{(2)}_{12.34}, b_{13.24}, b^{(1)}_{14.23})$ a minimum. Let this value be $b^{(2)}_{13.24}$. (4) Find
that value of $b_{14.23}$ which makes $f(b^{(2)}_{12.34}, b^{(2)}_{13.24}, b_{14.23})$ a minimum. Let
this value be $b^{(2)}_{14.23}$, etc. It will be seen that the steps involved in the
latter process are identical with those for the graphic method involving
three or more variables, as in each case deviations from the
approximations to the regression lines for the other independent variables are
plotted against one of the independent variables and the preceding
approximation to that regression line is adjusted so that it appears to be
the line of best fit.⁵
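The successive approximation process just described can be sketched in a few lines of Python. The example is only an illustration under stated assumptions: the observations are made up, the first guesses are taken as zero, and each coefficient in turn is re-solved from the normal equation that belongs to it (the equation obtained by minimizing f with respect to that coefficient), holding the latest values of the others. The names `sweep`, `A`, and `rhs` are hypothetical.

```python
def cov(u, v):
    """Sample covariance a_uv = S(u - mean_u)(v - mean_v) / (N - 1)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def sweep(A, rhs, b, order=(2, 0, 1)):
    """One round of successive approximation.

    b holds [b12.34, b13.24, b14.23]. Following the order used in the text,
    b14.23 is re-solved first, then b12.34, then b13.24.
    """
    for j in order:
        others = sum(A[j][k] * b[k] for k in range(len(b)) if k != j)
        b[j] = (rhs[j] - others) / A[j][j]
    return b


# Hypothetical observations, for illustration only.
x2 = [1, 2, 3, 4, 5, 6, 7, 8]
x3 = [2, 7, 1, 8, 3, 6, 4, 5]
x4 = [5, 3, 6, 2, 7, 1, 8, 4]
x1 = [6.1, 4.2, 10.9, 5.8, 13.4, 8.1, 17.2, 14.6]

xs = [x2, x3, x4]
A = [[cov(u, v) for v in xs] for u in xs]   # a22, a23, a24, ... of the normal equations
rhs = [cov(x1, u) for u in xs]              # a12, a13, a14

b = [0.0, 0.0, 0.0]                         # first guesses
for _ in range(200):
    sweep(A, rhs, b)

# Residuals of the three normal equations at the final approximation.
residuals = [rhs[j] - sum(A[j][k] * b[k] for k in range(3)) for j in range(3)]
print("coefficients:", [round(v, 6) for v in b])
print("normal-equation residuals:", residuals)
```

After enough rounds the residuals of all three normal equations are negligible, so the iterated values coincide with the least squares coefficients; how many rounds are needed depends on the correlations among the independent variables.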


PROOF OF CONVERGENCE TO THE LEAST SQUARES VALUES 
The problem of convergence is considered for three variables only.
The normal equations for three variables are given by

$$b_{12.3}\,a_{22} + b_{13.2}\,a_{23} = a_{12}$$
$$b_{12.3}\,a_{23} + b_{13.2}\,a_{33} = a_{13}.$$

If the iterative process is performed on these two equations, then the
Kth approximation to $b_{12.3}$ and $b_{13.2}$ respectively can be shown to be
equal to

$$b^{(K)}_{12.3} = \frac{s_1}{s_2}\left(r_{12} - r_{13}r_{23} + r_{12}r_{23}^2 - r_{13}r_{23}^3 + - \cdots - r_{13}r_{23}^{2K-3}\right) + b^{(1)}_{12.3}\,r_{23}^{2K-2} \qquad (8)$$

and

$$b^{(K)}_{13.2} = \frac{s_1}{s_3}\left(r_{13} - r_{12}r_{23} + r_{13}r_{23}^2 - r_{12}r_{23}^3 + - \cdots + r_{13}r_{23}^{2K-2}\right) - b^{(1)}_{12.3}\,\frac{s_2}{s_3}\,r_{23}^{2K-1} \qquad (9)$$

$$\phantom{b^{(K)}_{13.2}} = \frac{s_1}{s_3}\left(r_{13} - r_{12}r_{23} + r_{13}r_{23}^2 - r_{12}r_{23}^3 + - \cdots - r_{12}r_{23}^{2K-3}\right) + b^{(1)}_{13.2}\,r_{23}^{2K-2} \qquad (10)$$

where $b^{(K)}_{12.3}$ and $b^{(1)}_{12.3}$ are the Kth and 1st approximations
respectively to $b_{12.3}$ and $b^{(K)}_{13.2}$ and $b^{(1)}_{13.2}$ are the Kth and the 1st
approximations respectively to $b_{13.2}$. But

$$b_{12.3} = \frac{s_1}{s_2}\left(r_{12} - r_{13}r_{23} + r_{12}r_{23}^2 - r_{13}r_{23}^3 + - \cdots\right) \qquad (11)$$

and

$$b_{13.2} = \frac{s_1}{s_3}\left(r_{13} - r_{12}r_{23} + r_{13}r_{23}^2 - r_{12}r_{23}^3 + - \cdots\right), \qquad (12)$$

which can be obtained by expanding the denominator of equations (3)
and (4) in an infinite series.

Hence, comparing equations (8) with (11) and (9) or (10) with (12),
it will be seen that $b^{(K)}_{12.3}$ and $b^{(K)}_{13.2}$ can be made to approximate $b_{12.3}$
and $b_{13.2}$ respectively as closely as desired by taking K sufficiently large.

⁵ See Thomsen and Foote, op. cit., pp. 299-304.

SPEED OF CONVERGENCE OF REGRESSIONS 


The speed with which the successive approximations lead to stable
results is of interest for two reasons: (1) It takes time to make
successive approximations and the charts become messy after several sets of
dots have been inserted on them and (2) if the convergence is too slow,
the analyst may think that no further correction is needed in the line
with slope $b^{(K)}_{12.3}$ when in reality its slope is still quite different from the
mathematically calculated value.

By making algebraic substitutions in equations (8), (9), and (10), it
can be shown that

$$b_{12.3} - b^{(K+1)}_{12.3} = r_{23}^{2K}\left(b_{12.3} - b^{(1)}_{12.3}\right) \quad\text{and} \qquad (13)$$

$$b_{13.2} - b^{(K+1)}_{13.2} = -\frac{s_2}{s_3}\,r_{23}^{2K+1}\left(b_{12.3} - b^{(1)}_{12.3}\right) \qquad (14)$$

$$\phantom{b_{13.2} - b^{(K+1)}_{13.2}} = r_{23}^{2K}\left(b_{13.2} - b^{(1)}_{13.2}\right). \qquad (15)$$

Equation (13) states that the difference between the mathematically
calculated $b_{12.3}$ and any given approximation is equal to a function of
the correlation between the independent variables times the error that
was made in the first approximation to $b_{12.3}$. It shows that the higher the
correlation between the independent variables, the slower will be the
speed of convergence.

Using the size of the original error (that is, $b_{12.3} - b^{(1)}_{12.3}$) as a base, it
can be stated from equation (13) that the percentage of error left after
the Kth iteration is given by $r_{23}^{2K}$ times 100. Thus, if $r_{23} = 0.2$, the error
remaining after one iteration is 4 per cent of the original error and after
two iterations is 0.16 per cent, while if $r_{23} = 0.9$, the error remaining
after one iteration is 81 per cent of the original error. After two
iterations it is 65.61 per cent.
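The arithmetic in this paragraph is easy to reproduce. The sketch below, with illustrative function names, evaluates the percentage of the original error left after a given number of iterations, and also inverts the relation to give the number of iterations needed to bring the error below a chosen percentage (the inversion is an added convenience, not something stated in the text).

```python
import math


def error_remaining_pct(r23, iterations):
    """Per cent of the first-approximation error left after the given
    number of iterations, i.e. 100 * r23 ** (2 * iterations)."""
    return 100.0 * r23 ** (2 * iterations)


def iterations_needed(r23, target_pct):
    """Smallest whole number of iterations leaving at most target_pct
    per cent of the original error (for 0 < r23 < 1)."""
    return math.ceil(math.log(target_pct / 100.0) / math.log(r23 ** 2))


for r23 in (0.2, 0.9):
    for k in (1, 2):
        print(f"r23 = {r23}, {k} iteration(s): "
              f"{error_remaining_pct(r23, k):.2f} per cent of the error remains")

print("iterations to reach 1 per cent when r23 = 0.9:", iterations_needed(0.9, 1.0))
```

With $r_{23} = 0.9$ the error falls by only 19 per cent of its current size at each round, which is why a poor first approximation combined with highly correlated independent variables calls for many rounds.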

Equation (13) also indicates the importance of the drift lines, since
if the error in $b^{(1)}_{12.3}$ is small, one or two iterations may be enough to
yield a fairly accurate approximation to the mathematically correct
regressions, but if the error is large and the correlation between the
independent variables is also large, 6 or 8 or even more successive
approximations may be required to bring the slope of the regression to
within 0.1 of the correct value.⁶ It is assumed in the graphic method
that the successive approximation process is continued until a visual
inspection indicates that no further improvement is possible.


SPEED OF CONVERGENCE OF MULTIPLE CORRELATION COEFFICIENT


The size of the multiple correlation coefficient depends upon the
size of the deviations (or unexplained variation) from the final
regressions. If the regressions are inaccurate, the computed multiple
correlation coefficient will be inaccurate. In a three-variable problem, if
$r_{23}$ is near zero, convergence is rapid and errors in $b_{12.3}$ and $b_{13.2}$ are apt
to be small, but any errors made have relatively large effects on $R_{1.23}$.
When $r_{23}$ is near unity, convergence is slow and errors in $b_{12.3}$ and $b_{13.2}$
may be large, but these have relatively little effect on $R_{1.23}$ or on the
errors in predicting $X_1$. The same general reasoning applies to problems
involving more variables.


INTERPRETING THE CORRELATIONS INDICATED 
IN THE SCATTER DIAGRAMS 


In general, the regression lines obtained in the several charts of the 
graphic method have been interpreted correctly as “net,” that is, par- 
tial, regressions between the dependent variable and the separate in- 
dependent variables. Some confusion has occurred in interpreting the 
correlations indicated by the plotted observations in the scatter dia- 
grams. This point can be cleared up if one is careful to note the exact 
meaning of each of the two variables represented by the horizontal and 
vertical scales of the charts, and considers the “visually indicated” cor- 
relation to be the simple correlation of these two variables. 

In the first chart of a three-variable problem, the two variables repre- 
sented by the vertical and horizontal scales are simply the dependent 
variable and one of the independent variables, X». Hence, this chart, as 
originally plotted, indicates the simple correlation, 712. 

With respect to the second chart, if X,;—bw,sX2 is considered as a 
variable, Vi, and the simple correlation between V; and X; is obtained, 
the resulting correlation will be equal to the part correlation i», 88 
defined by Ezekiel.” In this sense we can say that the second chart in- 
dicates the part correlation 1/2. Likewise, if Xı—bıs.2X3 is considered 
as another variable, Va and the simple correlation between Уз and X» 
is obtained, that correlation will be equal to the part correlation 127%. 
If X, were used as the Second independent variable instead of Xs, the 
second chart would then indicate 17%. Since the final dots plotted around 
the final regression line in the first chart give the same result as would 
have been shown in the final chart had the variables been reversed, 
this scatter represents 127з. Similar results are given for а problem in- 
volving more variables. The final dots plotted around the final ap- 
proximation to the regression line in each of the charts represent the 
part correlation between the dependent variable and the respective 
independent variable. 

Part correlations as such do not appear to have much meaning in the
interpretation of an actual problem. However, by making certain
substitutions in Ezekiel's formula for part correlation, it can be shown
that

$$ {}_{12}r_{3} = \frac{r_{12.3}}{\sqrt{1 - r_{23}^{2}\,(1 - r_{12.3}^{2})}}. $$

Since, in the denominator of this formula, the quantity
$1 - r_{12.3}^{2}$ is non-negative and does not exceed unity, the part
correlation between two variables is always equal to or greater than the
partial correlation between the same variables, and the difference
increases as $r_{23}$ increases. Because of this relationship, charts
indicating the degree of part correlation can be used as an indication
of the approximate size of the partial correlation. If the indicated
part correlation is low, the partial correlation must be low, regardless
of the correlations between the independent variables, as in no case
will the partial correlation be higher than the corresponding part
correlation. If the part correlation is high, then either the
corresponding partial correlation is relatively high or the correlations
between the independent variables are very high. In a three-variable
problem, for example, if the part correlation was 0.9 or above, the
corresponding partial correlation would be 0.77 or above unless $r_{23}$
exceeded 0.8.

1 Ezekiel, Mordecai, Methods of Correlation Analysis, Ed. 2, John Wiley and Sons, Inc., New York,
1941, p. 213.
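
The numerical claim above can be checked with a short sketch. It assumes the relation between part and partial correlation has the form part = partial / sqrt(1 - r23^2 (1 - partial^2)); the function names are illustrative and do not come from the paper.

```python
import math


def part_from_partial(r_partial: float, r23: float) -> float:
    """Part correlation implied by the partial correlation r_12.3 and r_23."""
    return r_partial / math.sqrt(1.0 - r23**2 * (1.0 - r_partial**2))


def partial_from_part(r_part: float, r23: float) -> float:
    """Invert the relation: partial correlation implied by a part correlation."""
    squared = r_part**2 * (1.0 - r23**2) / (1.0 - r23**2 * r_part**2)
    return math.copysign(math.sqrt(squared), r_part)


# The text's example: part correlation 0.9 with r_23 at its stated limit 0.8.
partial_at_limit = partial_from_part(0.9, 0.8)

# With uncorrelated independent variables the two measures coincide.
same_when_uncorrelated = part_from_partial(0.5, 0.0)
```

Under this assumed form, a part correlation of 0.9 with $r_{23} = 0.8$ corresponds to a partial correlation just under 0.78, in line with the bound quoted in the text.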




DEVIATIONS FROM THE REGRESSIONS 


Some investigators have been puzzled by the fact that the deviations 
from the regression lines in certain charts are exactly equal. If the 
mathematically calculated values for the partial regression coefficients 
are obtained, the deviation of the final dot for any given observation 
from the regression line in each chart will be identical. Likewise, if 
calculated values for the dependent variable are plotted against actual 
values and a line is drawn through the origin with a slope of 1, the 
deviation of the calculated value from this line for any given observa- 
tion will be the same as the deviations discussed above. (The simple 
correlation between these two variables equals the multiple correlation 
for the analysis.) This follows from the fact that the deviation for the
$i$th observation in each of these charts (for a three-variable
analysis) is given by

$$ d_i = X_{1i} - a - b_{12.3}X_{2i} - b_{13.2}X_{3i}. $$


Different degrees of correlation are still indicated by the various 
charts, as the degree of correlation reflects not only the deviations of 
the dots from the respective regression lines but also the relative range 
in the dependent variable involved. This is clear from the definition of 
the coefficient of determination, which is the percentage of variation in 
the dependent variable explained by the independent variable. The 
sum of the squared deviations represents the unexplained variation or
the total variation in $X_1$ minus the variation explained. But to
translate this into a correlation coefficient, the total amount of
variation in the dependent variable to start with must also be known. In
the chart indicating the degree of multiple correlation, this is the
total variation in $X_1$. But in the charts indicating part
correlations, it is the amount of variation remaining in $X_1$ after
adjusting for the effects of the alternative independent variables.
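
A small numerical sketch may make this concrete. The data are invented for illustration, and the least-squares coefficients are computed from the normal equations for two independent variables; nothing here is taken from the paper's own examples. The deviations in the second chart equal the least-squares residuals, yet the part correlation read from that chart differs from the multiple correlation because the starting variation differs.

```python
def mean(values):
    return sum(values) / len(values)


def deviations(values):
    m = mean(values)
    return [v - m for v in values]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def correlation(u, v):
    du, dv = deviations(u), deviations(v)
    return dot(du, dv) / (dot(du, du) * dot(dv, dv)) ** 0.5


# Hypothetical data: x1 is the dependent variable.
x1 = [5.7, 4.4, 7.1, 6.7, 10.2, 9.6, 12.3, 11.0]
x2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x3 = [5.0, 1.0, 4.0, 2.0, 6.0, 3.0, 7.0, 2.0]

# Least-squares partial regression coefficients from the normal equations.
d1, d2, d3 = deviations(x1), deviations(x2), deviations(x3)
s22, s33, s23 = dot(d2, d2), dot(d3, d3), dot(d2, d3)
s12, s13 = dot(d1, d2), dot(d1, d3)
den = s22 * s33 - s23**2
b12_3 = (s12 * s33 - s13 * s23) / den
b13_2 = (s13 * s22 - s12 * s23) / den
a = mean(x1) - b12_3 * mean(x2) - b13_2 * mean(x3)

fitted = [a + b12_3 * u + b13_2 * v for u, v in zip(x2, x3)]
residuals = [y - f for y, f in zip(x1, fitted)]

# Second chart: X1 adjusted for X2, plotted against X3 with the line a + b13.2*X3.
adjusted = [y - b12_3 * u for y, u in zip(x1, x2)]
chart_deviations = [w - (a + b13_2 * v) for w, v in zip(adjusted, x3)]

part_correlation = correlation(adjusted, x3)
multiple_correlation = correlation(x1, fitted)
```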


EFFECTS OF NOT PASSING THE REGRESSIONS THROUGH THE MEANS 


In the original description of the graphic method no mention was
made of drawing the regressions through the means of the variables. In
his original article, Bean stated: "At this point it may be observed that
the arbitrary placing of the approximation curves without reference to
the average values of $X_1$ and of the other variables does not affect
the values of $X_1$ computed from the curves. For example, had the
approximation curve in section 1 been placed higher, the residuals in
sections 2 and 3 would have been correspondingly decreased and the
curves lowered."8 The truth of this is fairly obvious.


APPLICATION TO PROBLEMS INVOLVING MORE VARIABLES OR 
CURVILINEAR RELATIONSHIPS 


Most of the mathematical proofs have been given for problems that 
involve three or four variables. The extension to problems involving 
more variables is obvious, although the algebra becomes complicated. 

The graphic method was developed primarily to handle curvilinear
rather than linear relationships. The proofs given here are in terms of
linear relationships because the least squares method, as usually
considered, is applicable to linear relationships or those that can be
transformed to a linear form. Thus, it is easier to show the
relationships between the graphic and the mathematical methods by
confining the discussion to the linear case. It has been generally
recognized by mathematicians and can easily be demonstrated by example
that the graphic method provides at least a satisfactory method for
obtaining approximations to the net regression curves when dealing with
multiple functional relationships, regardless of whether the nature of
the function is known. The extent to which the graphic method can be
used to determine the nature of curves for stochastic or probability
relationships will depend mainly on the degree of correlation and the
extent to which the sample represents the population. As, in most cases,
one never knows for sure whether a given small sample is representative
of the population, any user of regression methods must proceed with
caution and must subject his final results to common sense and to any
other outside checks at his disposal. Some users of the graphic method,
as well as many users of mathematical methods, have assumed that their
methods, as such, are sufficiently reliable so that outside checks are
not necessary.


8 Bean, op. cit., p. 393. A mathematical proof is given in Foote and Ives, op. cit., pp. 32-33.


SUMMARY 


The method of graphic multiple correlation suggested by L. H. Bean
essentially is based upon three mathematical principles:

(1) The multiple regression equation becomes the equation of a curve
when all of the independent variables except one are held constant. In
the case of linear regression, the curve is a straight line whose slope
is equal to the partial regression coefficient between the dependent
variable and that independent variable which is permitted to vary. For
this reason the slopes of the drift lines in the first chart of a
three-variable analysis indicate the partial regression coefficient.

(2) If in a three-variable analysis, the true partial regression line is 
obtained in the first chart, the simple correlation between deviations 
from this line and the second independent variable is equal to the part 
correlation (as defined by Ezekiel) and the simple regression is equal to 
the partial regression. Thus, the line obtained in the second chart of a 
three-variable analysis approximates the second partial regression. If 
the degree of correlation between the two independent variables is low, 
the part correlation nearly equals the partial correlation. Hence, the 
scatter in this chart in most cases indicates approximately the degree 
of partial correlation. 

(3) The method of successive approximation outlined by Bean is
analogous to a mathematical iterative process which converges to the
least squares solution. Thus, even if an error is made in the first
approximations to the regressions, succeeding approximations will tend
to yield more and more accurate results.
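
The third principle can be illustrated with a minimal sketch. It assumes the iterative process is of the alternating (Gauss-Seidel) kind, in which each coefficient is re-estimated by a simple regression after adjusting for the current value of the other; the standardized moments, starting value and tolerance are all hypothetical.

```python
def successive_approximation(r23, b2_true, b3_true, b3_start,
                             tol=1e-10, max_sweeps=10_000):
    """Alternate two simple regressions on standardized moments until they settle.

    The independent variables are taken to have unit sums of squares and
    correlation r23, so the moments with the dependent variable are fixed by
    the (hypothetical) true coefficients.
    """
    s12 = b2_true + b3_true * r23
    s13 = b2_true * r23 + b3_true
    b2, b3 = 0.0, b3_start
    sweep = 0
    for sweep in range(1, max_sweeps + 1):
        new_b2 = s12 - b3 * r23        # regress X1 - b3*X3 on X2
        new_b3 = s13 - new_b2 * r23    # regress X1 - b2*X2 on X3
        change = max(abs(new_b2 - b2), abs(new_b3 - b3))
        b2, b3 = new_b2, new_b3
        if change < tol:
            break
    return b2, b3, sweep


# Same poor first approximation, low versus high intercorrelation.
low = successive_approximation(r23=0.2, b2_true=1.0, b3_true=0.5, b3_start=3.0)
high = successive_approximation(r23=0.9, b2_true=1.0, b3_true=0.5, b3_start=3.0)
```

In this sketch the error shrinks by a factor of $r_{23}^2$ on each sweep, so the highly intercorrelated case needs many more sweeps to reach the same least-squares values.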

The speed of convergence depends chiefly on the size of the error in
the first approximation and the size of the correlation between the
independent variables. The better the first approximation and the
smaller the intercorrelation, the faster will the process tend to
converge. The degree of intercorrelation is determined by the nature of
the variables included in the analysis and hence, once the variables are
chosen, very little can be done graphically to speed up the convergence.
However, the accuracy of the first approximations may be greatly
enhanced by the use of drift lines.

The same reasoning can be extended to problems that involve more
than three variables.

RECENT ADVANCES IN FINDING BEST
OPERATING CONDITIONS*


R. L. ANDERSON
Institute of Statistics, North Carolina State College


This paper discusses various experimental procedures used to
estimate the optimal point on a response surface and to explore the
nature of the response surface in the vicinity of this optimum.
Multi-factor experiments were first set up to investigate one factor at
a time; then Fisher and Yates introduced the complete factorials for
field experiments, plus confounded arrangements for incomplete block
designs. More recently, fractional replication designs have been
introduced to cut down the size of the experiments.

Hotelling devised methods of locating the optimal point using a single
factor. Friedman and Savage outlined a sequential one-factor-at-a-time
procedure when several factors are involved.

Box and Wilson present a method of locating the optimum and of 
exploring the response surface in which many factors are varied at the 
same time. They present the use of the path of steepest ascent to get to 
a “near-stationary” region if the experimenter starts at a point far 
removed from it. When the experimenter is near such a region, they 
present the use of a composite design to estimate quadratic and inter- 
action effects. The nature of the response surface is explored by the 
use of a canonical transformation. 

The usefulness of these sequential procedures in various experimental
situations is discussed.


1. INTRODUCTION


Most experimentation has as its ultimate objective the estimation of 
some optimal response. However, the lack of a simple experimental 
procedure to achieve this objective has resulted in a tremendous num- 
ber of piecemeal experiments, each designed to pinpoint some section 
of the response surface. This paper will discuss some contributions to 
the problem of maximizing a function, 

$$ y = \phi(x_1, x_2, \ldots, x_k), \tag{1} $$

where $y$ is the expected response and $x_i$ the amount of the $i$th
factor used in producing $y$. For example, a quadratic response function
might be written in this form:

$$ y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \beta_{ij} x_i x_j, \tag{2} $$

where $\beta_i$ is called a linear or main effect, $\beta_{ii}$ a
quadratic effect and $\beta_{ij}$ ($i \neq j$) an interaction effect. Of
course, the optimal response may be a minimum (such as with costs), but
the procedures are the same for determining a minimum as for a maximum.
In general the word "optimal" will refer to either case. The problem is
one of finding the level of each factor to achieve optimal $y$, assuming
that the factor levels can be continuously varied. The combination of
factor levels which produces the optimal response will be called the
optimal factor combination.

* Revision of a paper presented at the 1952 annual meeting of the American Statistical Association.

In general, it would be advantageous to know the response function
itself. For example, most production is carried on for a profit. But the
optimal factor combination usually will change with a change in the
factor and product prices. Assume the production function is of the
form

$$ q = \phi(x_1, x_2, \ldots, x_k), $$

where the parameters of $q$ have been estimated. The profit is then

$$ \pi = pq - \sum_{i=1}^{k} p_i x_i, $$

where $p$ is the price of the product and $p_i$ the price of the $i$th
factor. Then the static optimal factor combination is given by the
solution of the $k$ equations

$$ \frac{\partial \pi}{\partial x_i} = p\,\frac{\partial q}{\partial x_i} - p_i = 0, \qquad i = 1, 2, \ldots, k. \tag{3} $$

Of course, the dynamic solution is more complicated, because changes
in $q$ and the $x_i$ can be expected to change $p$ and the $p_i$,
especially if this product is an important part of the economy. In fact,
one might need to know the demand function for the product and the
supply functions for the factors. But the important point to note here
is that, once these functions have been determined, the determination of
the optimum requires no more experimentation.
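
As a minimal sketch of the static solution, take a single factor and a fitted quadratic production function $q = b_0 + b_1 x + b_{11} x^2$; the coefficients and prices below are hypothetical. Setting the derivative of profit to zero gives $p(b_1 + 2 b_{11} x) = p_1$, so the optimal level moves whenever the prices change, with no new experiment needed.

```python
def optimal_input(b1, b11, product_price, factor_price):
    """Level x solving p * dq/dx = p_1 for q = b0 + b1*x + b11*x**2 (b11 < 0)."""
    if b11 >= 0:
        raise ValueError("b11 must be negative for an interior maximum")
    return (factor_price / product_price - b1) / (2.0 * b11)


def profit(x, b0, b1, b11, product_price, factor_price):
    """Profit p*q - p_1*x for the one-factor quadratic production function."""
    q = b0 + b1 * x + b11 * x**2
    return product_price * q - factor_price * x


# Hypothetical fitted coefficients and two price situations.
b0, b1, b11 = 20.0, 4.0, -0.5
x_cheap = optimal_input(b1, b11, product_price=2.0, factor_price=1.0)
x_dear = optimal_input(b1, b11, product_price=2.0, factor_price=4.0)
```

With these illustrative numbers the optimal level falls from 3.5 to 2.0 when the factor price rises from 1 to 4.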

Experimental procedures for estimating the parameters of a
multi-factor response function are now being developed. Box [3]
discusses designs for estimating a planar response surface. He shows
that

(i) when prior knowledge of the response surface exists, the design may be
rotated to reduce possible biases (e.g. quadratic and interaction), and
(ii) rotation can be used to eliminate such systematic effects as time trends.


Box and Wilson [2] describe methods of exploring the response surface
in a "near-stationary" region.

Before starting any experimentation to explore the response surface, 
the experimenter must select the factors and the factor levels to be 
used in the experiment. The factors are usually decided on the basis of 

(i) previous experimentation and theoretical study in the field,
(ii) practical consideration of factors which can be varied in the production
process and in the experiment, and
(iii) time and facilities available.


The selection of the factor levels is usually a matter of judgment on
the part of the experimenter. He considers the possible range of the 
factor levels and previous experience on the differences in levels needed 
to produce detectable response changes, if such exist. These problems 
are common to all the experimental procedures to be discussed in this 
paper. They are largely non-statistical problems, but the statistician 
should be sure that the experimenter understands the importance of 
selecting the correct factors and suitable factor levels. 


2. FACTORIAL EXPERIMENTS 


In the first multi-factor experiments, a single factor was varied at
a time. For example with 5 factors, one might plan $5l$ experiments,
in which each of the factors in turn was used at $l$ levels while the
other 4 factors were held at some starting level. Fisher [9] and Yates
[22] encouraged the use of complete factorials and developed a large
number of special designs involving them. In a complete factorial, all
combinations of the factor levels are used, e.g., $l^5$ for the above
experiment. These designs were developed for experiments in which the
experimental error could not be neglected. In order to estimate the 
magnitude of this error in each experiment, the experiment had to be 
repeated several times, say r. These factorial designs were formed large- 
ly for field experiments in which sequential experimentation would be 
less useful than with laboratory experiments, and the factors were 
often of the discrete type, e.g. varieties or rations. 
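
The difference in size between the two plans can be sketched by enumerating the runs. The three coded levels and the all-zero starting combination below are hypothetical choices made only for the illustration.

```python
from itertools import product


def one_factor_at_a_time(start, levels):
    """Runs that vary each factor over `levels` while the others stay at `start`."""
    runs = []
    for i in range(len(start)):
        for level in levels:
            run = list(start)
            run[i] = level
            runs.append(tuple(run))
    return runs


def complete_factorial(n_factors, levels):
    """Every combination of the levels across all factors."""
    return list(product(levels, repeat=n_factors))


levels = [0, 1, 2]          # l = 3 hypothetical coded levels
start = (0, 0, 0, 0, 0)     # 5 factors held at their starting level
ofat_runs = one_factor_at_a_time(start, levels)
factorial_runs = complete_factorial(5, levels)
```

For 5 factors at 3 levels this gives 15 runs for the one-at-a-time plan against 243 combinations for the complete factorial, before any repetition to estimate error.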

Because of the large number of factor combinations required in many
field experiments, it was felt that some form of incomplete block
design was needed to reduce the experimental error. This resulted in
the so-called confounded designs, e.g. with $2^k$, $3^k$,
$3 \times 2^k$, $3^k \times 2$, $4^k$ designs. These are described by
Yates [23]. More complicated factorial designs have been constructed by
Nair [17, 18], Bose [1], Finney [8], and Li [16], among others.

When physical scientists and engineers became interested in multi- 
factor experiments, they found that complete and confounded factorials 
required too many experimental units, especially since the experimental 
errors were often much lower than in field experiments. One method of 
reducing the number of experimental units was to use higher order 
interaction effects to estimate the error and hence avoid repetitions 
of the design. Then Finney [6, 7], Plackett and Burman [19],
Kempthorne [14], Rao [20], and Davies and Hay [5] developed the
fractional replication designs, based on using parts of the confounded
designs. Yates [22] and Hotelling [13] had already mentioned the use of
such designs. A new approach for continuous factor levels has been
suggested by Box [3].


3. LOCATING THE OPTIMAL POINT USING A SINGLE FACTOR 


Hotelling [12] considers in detail the problem of obtaining a maxi- 
mum response when only one factor is involved, advocating the fol- 
lowing experimental procedure: 

(i) An early speculative study of the problem to indicate the range within which
the optimum lies.

This study should also include some good theory to help delimit the
problem.

(ii) An intermediate stage to supply estimates of the parameters of the response
function.

One might use six equally spaced values within the range in (i) to fit a
fifth degree polynomial and estimate the optimum point $\hat{x}$. If
several samples are obtained at each point, one can also estimate
$\sigma$.

(iii) A final experiment.

Let $x$ measure the deviation from $\hat{x}$ and assume the true
response equation is

$$ f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 x^4 + \cdots. \tag{4} $$

Assume $f(x)$ can be approximated by a quadratic equation, so that 3 or
more values of $x$ are needed.

The estimates of the parameters in the quadratic equation will
be biased if $\beta_3, \beta_4, \cdots$ in equation (4) are not zero.
Hotelling shows how to allocate $N$ sample values so as to make the
cubic bias zero and the quartic bias a minimum, assuming the variance is
fixed.
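
The bias in question can be seen in a small sketch. It fits a quadratic through three equally spaced points and locates its vertex; the spacing and the response coefficients are hypothetical, and the equal spacing is chosen for simplicity rather than being Hotelling's recommended allocation.

```python
def quadratic_optimum(h, y_minus, y_zero, y_plus):
    """Fit y = b0 + b1*x + b2*x**2 through x = -h, 0, +h; return (b1, b2, vertex)."""
    b1 = (y_plus - y_minus) / (2.0 * h)
    b2 = (y_plus - 2.0 * y_zero + y_minus) / (2.0 * h**2)
    if b2 == 0:
        raise ValueError("no curvature: the fitted response has no stationary point")
    return b1, b2, -b1 / (2.0 * b2)


def response(x, beta3=0.0):
    """Hypothetical true response; its quadratic part has its maximum at x = 0.3."""
    return 10.0 + 0.6 * x - 1.0 * x**2 + beta3 * x**3


h = 1.0
b1_clean, b2_clean, vertex_clean = quadratic_optimum(
    h, response(-h), response(0.0), response(h))
b1_cubic, b2_cubic, vertex_cubic = quadratic_optimum(
    h, response(-h, 0.2), response(0.0, 0.2), response(h, 0.2))
```

With a cubic term present, the fitted linear coefficient absorbs $\beta_3 h^2$ and the estimated optimum shifts from 0.3 to 0.4, which is the kind of bias the allocation of sample values is meant to control.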

Hotelling briefly studied the case of two factors, for which 6 points
are needed to estimate linear, quadratic and first order interaction
coefficients. In order to make the cubic bias vanish, he established that

(i) no 3 points lie on a straight line,
(ii) no 4 points lie on 2 straight lines through the origin,
(iii) no 4 points can be vertices of a parallelogram,
(iv) the 6 points cannot consist of the origin and the vertices of a regular
pentagon with center at the origin.


4. A SEQUENTIAL ONE-FACTOR-AT-A-TIME DESIGN


Friedman and Savage [11] described a sequential multi-factor plan
to locate a local maximum and to describe the response surface near
this maximum. They wanted to explore the response surface near the
maximum in order to

(i) indicate the seriousness of choosing a factor combination somewhat
different from the maximum in order to protect other qualities than the
one studied,
(ii) determine the relative importance of various factors,
(iii) serve as a stimulus to develop the theoretical nature of the response,
(iv) indicate the seriousness of a lack of control of factor levels in the
production process.


They reject the complete factorial design because

(a) the levels chosen may be far from the maximum,
(b) if one chooses levels too far apart, he may obtain a very superficial
description of the response surface near the maximum.


In addition, they point out that the factorial design is essentially a 
discrete level design and does not take account of the essential con- 
tinuity or ordered character of many factor levels. This is tied in with 
the Hotelling results, which show how one can improve the estimate of 
the maximum by choosing the levels at unequal intervals and by using 
a different number of samples for each level. However, the use of
orthogonal linear forms simplifies tests of linear, quadratic, and higher
components when the levels are equally spaced.

The Friedman-Savage procedure is as follows:

(i) Use the best estimate of the optimal factor combination as the initial
one.

(ii) Order the factors in some manner. The authors do not say how to do this,
but one might order them according to his estimate of the possible effects
of changes of each on the final response. For example, if one factor were
very important, the experiments might not detect differences for other
factors unless this first factor were near its optimal level.

(iii) Vary the levels of only the first factor until an approximate optimum
is located for it. Presumably the Hotelling idea of fitting a polynomial
would be useful if the levels were continuous.

(iv) Using the optimal level of the first factor and the starting point of all but
the second, find the optimal level for the second; proceed in this manner
until all factors have been investigated.

(v) If necessary, repeat another round, but start with the set of local optima
in (iv).

(vi) If the changes in the second round indicate a need for further experimentation,
it might be advisable to proceed on a path defined by the two sets
of local optima. This is similar to a device used by Box and Wilson, which
will be explained later.
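
Steps (i) to (v) can be sketched as follows. This is only an illustration under stated assumptions: the response function is hypothetical and free of experimental error, each one-factor search fits a parabola through three levels (in the spirit of the Hotelling idea mentioned in step (iii)), and the spacing is halved each round; step (vi) is not attempted.

```python
def response(x):
    """Hypothetical error-free response with its maximum at (2, 1)."""
    u, v = x[0] - 2.0, x[1] - 1.0
    return 50.0 - u * u - 2.0 * v * v - u * v


def best_level(f, x, i, step):
    """Vertex of the parabola through three levels of factor i, others held fixed."""
    def at(level):
        trial = list(x)
        trial[i] = level
        return f(trial)

    lo, mid, hi = at(x[i] - step), at(x[i]), at(x[i] + step)
    curvature = hi - 2.0 * mid + lo
    if curvature >= 0:               # no interior maximum: move to the better end
        return x[i] + step if hi > lo else x[i] - step
    return x[i] - step * (hi - lo) / (2.0 * curvature)


def one_factor_at_a_time(f, start, step=1.0, rounds=20, tol=1e-9):
    """Repeat rounds of one-factor searches until the levels stop changing."""
    x = list(start)
    rounds_used = 0
    for rounds_used in range(1, rounds + 1):
        previous = list(x)
        for i in range(len(x)):
            x[i] = best_level(f, x, i, step)
        step = max(step / 2.0, 0.01)     # narrow the spacing near the optimum
        if max(abs(a - b) for a, b in zip(x, previous)) < tol:
            break
    return x, rounds_used


optimum, rounds_used = one_factor_at_a_time(response, start=[0.0, 0.0])
```

Because the two factors interact in this hypothetical response, one round is not enough; several rounds are needed before the levels settle near (2, 1).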


Friedman and Savage suggest that the differences between factor 
levels should be reduced as one gets closer to the optimum. This en- 
ables one to map the surface near the optimum. However, if the experi- 
mental error is very large, this error may mask the small response 
differences near the optimum. 

Friedman and Savage made a number of comparisons, showing the 
smaller number of experiments with the sequential plan as compared to 
complete factorials. However, they did not make comparisons with 
fractional factorial designs. If there are many factors, it is possible to 
use a small fraction of the complete factorial without confounding 
main effects and 2-factor interactions with each other. For example,
1/8 of a $2^{10}$ design will enable one to estimate all main effects
and 2-factor interactions if all 3 and higher-factor interactions are
negligible, and similarly for 1/9 of a $3^7$ design (see Kempthorne
[15], Sec. 21.7). If previous information indicates that certain of the
2-factor interactions can be neglected, the designs could be even
further fractionalized.
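
A simple count shows why these fractions leave room for the effects named. The sketch below compares the number of runs with the number of parameters for the mean, main effects and 2-factor interactions; having enough runs is a necessary condition only, and whether a particular fraction keeps these effects unconfounded depends on its construction, which is not examined here.

```python
from math import comb


def runs_and_parameters(levels, n_factors, fraction):
    """Runs in a 1/fraction replicate, and parameters for the mean,
    main effects and 2-factor interactions."""
    runs = levels**n_factors // fraction
    df_main = n_factors * (levels - 1)
    df_two_factor = comb(n_factors, 2) * (levels - 1) ** 2
    return runs, 1 + df_main + df_two_factor


eighth_of_2_10 = runs_and_parameters(levels=2, n_factors=10, fraction=8)
ninth_of_3_7 = runs_and_parameters(levels=3, n_factors=7, fraction=9)
```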


5. A SEQUENTIAL DESIGN VARYING MANY FACTORS AT A TIME


A recent article by Box and Wilson [2] is devoted to the problem of
determining optimal factor combinations in chemical investigations.
Their methods also enable the response surface to be described in the
neighborhood of the optimum. The discussion of these methods in the
1951 article was condensed for publication. After discussion with Dr.
Box, this writer believes the following is a correct description of the
Box-Wilson techniques.

(i) Conduct some initial experiments in the vicinity of the previously
known best factor combination. These initial factor combinations
probably would be based on a complete or fractional factorial design,
usually of the $2^k$ type. The $2^k$ designs are simple to analyze and
interpret and give good estimates of main effects and 2-factor
interactions.

(ii) If the main effects in step (i) are large compared to the 2-factor
interactions (i.e., the response surface is roughly planar in the region
of these initial factor levels), the experimenter would be led to try new
factor levels which are changed in the direction of largest response in
the initial experiments. Box and Wilson explore with new experiments
the path of steepest ascent (or descent if a minimum is desired), in which
each factor is varied proportionally to its unit effect in the initial
experiments. The procedure of steps (i) and (ii) is repeated until the
first order effects are small, so that no further progress is possible by
this method.1 The experimenter is then brought to a near-stationary
region.

A technique is provided for avoiding gross errors in selecting the
ranges of the factor levels in the initial set of experiments.
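
Steps (i) and (ii) can be sketched for two factors. The design is a $2^2$ factorial in coded units of -1 and +1; the responses, the experimental center and the unit sizes are all hypothetical, and the step length along the path is an arbitrary choice rather than anything prescribed by Box and Wilson.

```python
from itertools import product


def main_effect_slopes(design, responses):
    """Least-squares slopes per coded unit from a two-level design coded -1/+1."""
    n = len(design)
    k = len(design[0])
    return [sum(run[i] * y for run, y in zip(design, responses)) / n
            for i in range(k)]


def steepest_ascent_path(center, unit_sizes, slopes, n_steps, step=1.0):
    """Points along the path, moving each factor in proportion to its slope."""
    largest = max(abs(b) for b in slopes)
    if largest == 0:
        raise ValueError("all first-order effects are zero: no direction of ascent")
    direction = [b / largest for b in slopes]
    return [
        [c + m * step * d * u for c, d, u in zip(center, direction, unit_sizes)]
        for m in range(1, n_steps + 1)
    ]


def hypothetical_response(coded):
    """A roughly planar region: no interaction or curvature in this illustration."""
    return 60.0 + 4.0 * coded[0] + 2.0 * coded[1]


design = list(product([-1, 1], repeat=2))
responses = [hypothetical_response(run) for run in design]
slopes = main_effect_slopes(design, responses)
path = steepest_ascent_path(center=[100.0, 5.0], unit_sizes=[10.0, 0.5],
                            slopes=slopes, n_steps=3)
```

New experiments would then be run at the points in `path`, and a fresh factorial set up once the response stops improving.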

(iii) When the experimenter has reached a near-stationary region,
he conducts some additional experiments specifically designed to
estimate the quadratic and interaction effects in equation (2). The
$3^k$ designs have been developed to do this; however, the size of a
$3^k$ experiment becomes unwieldy for large $k$. One notes that the
2-factor interaction effects for a $3^k$ experiment can be divided into
four groups: linear × linear; quadratic × linear; linear × quadratic;
and quadratic × quadratic. Presumably the latter effects (which are of
the fourth degree) would be negligible, and perhaps the middle two
groups of effects (which are of the third degree). Hence one might like
to use a design which would enable him to estimate only the linear,
quadratic and linear × linear interaction effects [the parameters in
equation (2)].

Box and Wilson's composite design was developed to accomplish this
purpose. One form of this design is to add $(2k+1)$ experiments to the
last set of $2^k$ experiments in step (ii). If we designate the factor
levels at the center of this $2^k$ design as $(0, 0, \ldots, 0)$, the new
factor combinations would be

$$ (0, 0, \ldots, 0);\ (\pm\alpha, 0, \ldots, 0);\ (0, \pm\alpha, \ldots, 0);\ \ldots;\ (0, 0, \ldots, \pm\alpha). $$

$\alpha$ can be determined either so that the design is orthogonal (the
estimated effects are all non-correlated) or so that the second order
effects are estimated with equal precision. If the factorial experiments
had indicated that the optimum was near one of the corners of the
factorial design, the center for the composite design could be located
at this corner.
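
The layout of one such composite design can be sketched by listing its points in coded units. The value of the axial distance used below is an arbitrary placeholder; it is not the value that makes the design orthogonal or equalizes the precision of the second order effects.

```python
from itertools import product


def composite_design(k, alpha):
    """Factorial points (+/-1), one center point and 2k axial points at +/-alpha."""
    factorial = [tuple(float(v) for v in run) for run in product([-1, 1], repeat=k)]
    center = [tuple(0.0 for _ in range(k))]
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(tuple(point))
    return factorial + center + axial


def quadratic_parameter_count(k):
    """Mean, k linear, k quadratic and k(k-1)/2 interaction terms of equation (2)."""
    return 1 + 2 * k + k * (k - 1) // 2


design = composite_design(k=3, alpha=1.5)   # alpha = 1.5 is a placeholder value
```

For three factors this gives 15 factor combinations against 10 parameters in the quadratic model, compared with 27 combinations for the full three-level factorial.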

(iv) Once the experimenter has obtained rather stable estimates
of the parameters in equation (2), the optimal factor levels,

$$ x^0 = (x_1^0, x_2^0, \ldots, x_k^0), \tag{5} $$

can be estimated. The predicted response for this factor combination
is

$$ y^0 = b_0 + \tfrac{1}{2} \sum_{i=1}^{k} b_i x_i^0, \tag{6} $$

where $b_i$ is the estimate of $\beta_i$ in (2). After shifting the
origin of the system to $x^0$, the quadratic form (2) can be reduced to
the canonical form

$$ \hat{y} = y^0 + \sum_{i=1}^{k} \lambda_i X_i^2, \tag{7} $$

where $\hat{y}$ is the estimate of $y$ and the $X_i$ are linear
functions of the $x_i$. These $X_i$ are the axes of a coordinate system
with center at $x^0$.

1 When the response becomes almost stationary in the first path, a new set of $2^k$ experiments is
conducted to determine if the first order effects are small in this new region or if a new path should be
followed. It should be pointed out that if the main effects had been small compared to the interaction
effects in the initial experiments of step (i), step (ii) would be omitted.
(v) The following tentative conclusions could be made:

(a) If $x^0$ is not far removed from the experimental center and the
$\lambda_i$ in (7) are of the same sign, the experimenter can conclude
that $y^0$ is near the true optimum response. He probably would conduct
several confirmatory experiments with factor combinations near $x^0$ and
then reevaluate $x^0$ until fairly stable results were obtained.

(b) If one of the $\lambda$'s is small relative to the others, the
response surface has a ridge along the corresponding $X$-axis. Dr. Box
states that "often the most important practical problem is to determine
the nature of the local ridge system." If a ridge is present, the
experimenter can then use as the optimal factor combination the one
along this ridge which is cheapest or easiest to use or the one which
produces the optimal response for some other characteristic.

(c) If some of the larger $\lambda$'s in (7) are of opposite signs, the
experimenter is at a saddlepoint; Box and Wilson outline additional
experiments to use in this case.

(d) If $x^0$ is far removed from the experimental center, equation (7)
could not describe the surface at $x^0$. In this case one would suspect
the existence of a rising ridge along the axis of $X_1$, say, with a
small value of $\lambda_1$ in (7). It would now be advisable to shift to
an origin on $X_1$ but near the experimental center and obtain as a
substitute for equation (7):

$$ \hat{y} = y' + B X_1' + \lambda_1 X_1'^2 + \sum_{i=2}^{k} \lambda_i X_i^2, \tag{7'} $$

where $y'$ is the predicted response at this new origin. The
experimenter would then explore along the $X_1'$-axis.


An independent successful use of the Box-Wilson techniques is
given by Read [21].
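
For two factors the canonical analysis can be sketched in closed form. The fitted coefficients below are hypothetical, the predicted response at the stationary point is taken as $b_0$ plus half the sum of $b_i x_i^0$, and the threshold used to call a coefficient "small relative to the others" is an arbitrary illustrative choice.

```python
import math


def canonical_analysis(b0, b1, b2, b11, b22, b12):
    """Stationary point, predicted response there, and canonical coefficients (k = 2)."""
    # Stationary point: 2*b11*x1 + b12*x2 = -b1 and b12*x1 + 2*b22*x2 = -b2.
    det = 4.0 * b11 * b22 - b12**2
    if det == 0:
        raise ValueError("degenerate quadratic: no unique stationary point")
    x1 = (-2.0 * b22 * b1 + b12 * b2) / det
    x2 = (-2.0 * b11 * b2 + b12 * b1) / det
    y0 = b0 + 0.5 * (b1 * x1 + b2 * x2)
    # Canonical coefficients: eigenvalues of [[b11, b12/2], [b12/2, b22]].
    mean = 0.5 * (b11 + b22)
    radius = math.hypot(0.5 * (b11 - b22), 0.5 * b12)
    return (x1, x2), y0, (mean - radius, mean + radius)


def describe(lambdas, ridge_ratio=0.1):
    """Rough reading of the surface; ridge_ratio is an illustrative threshold."""
    sizes = [abs(v) for v in lambdas]
    if max(sizes) == 0:
        return "flat"
    if min(sizes) / max(sizes) < ridge_ratio:
        return "ridge"
    if min(lambdas) < 0 < max(lambdas):
        return "saddle point"
    return "maximum" if max(lambdas) < 0 else "minimum"


# Hypothetical fitted coefficients of equation (2) with k = 2.
point, y0, lambdas = canonical_analysis(b0=80.0, b1=2.0, b2=1.0,
                                        b11=-2.0, b22=-1.5, b12=1.0)
surface = describe(lambdas)
```

With these coefficients both canonical coefficients are negative and of similar size, so the stationary point is read as a maximum, corresponding to case (a).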

6. CONCLUSIONS

This paper has presented some recent ideas on the use of sequential 
methods to estimate optimal factor combinations and to explore the 
nature of the response surfaces in the vicinity of these optima. The 
reader should be cautioned that the success of these sequential pro- 
cedures depends on the following conditions: 
(i) The experiments can be sequentialized.
(ii) The factor levels can be varied continuously.
(iii) The experimental error is small and generally well estimated in advance.


If some of the factor levels are discrete, it may be necessary to locate
an optimal combination of the other factors for each discrete level
and use the best of the local optima. However, it may be possible to
find characteristics of the discrete factors which are continuous, for 
example, genetic features of varieties, chemical compositions of differ- 
ent soils, or average educational or economic features of different 
human groups. Hence, one of the objectives in future research may be 
the quantification of qualitative factors. 

The use of sequential procedures in biological and social
experimentation may be limited because of the length of time required to
conduct the experiments and the presence of large experimental errors.
In many cases, however, response changes over time can be measured by
the introduction of additional experimental factors. If such time
changes cannot be estimated directly, it may be necessary to use some
control factor combinations with every new set of factor combinations.
If controls are needed in the sequential procedure, it may be more
efficient to use larger initial factorial experiments. Kempthorne [15]
discusses the use of fractional factorials in incomplete block designs.
If replications are needed to estimate experimental errors, replicate
experiments also can be performed in sequential experimentation. It
would appear that, even in the biological and social fields, the
sequential methods discussed here should be useful in planning many
long-term experiments. Here is a place for coordinated research at
several research centers, to avoid duplications and serious omissions in
the factor combinations used.2

Two methods of conducting multi-factor sequential experiments
have been discussed: the use of one-factor-at-a-time and the
Box-Wilson procedure of varying several factors at once. Another method
might be mentioned: a procedure based on the random selection of
factor combinations. It would be useful to have these three methods
compared in various experimental situations. This would seem to be
a useful statistical research project. As more response surfaces are
explored, it will be useful to know how many of them have ridge sys- 
tems. Presumably the one-factor-at-a-time approach will not be very 
efficient in the exploration of a ridge. In particular, this approach 
would not tell the experimenter that the optimal factor combination 
can be located anywhere along this ridge. In most production, several 
responses must be optimized at the same time; hence, a good descrip- 
tion is needed for each response surface, for example, costs of produc- 
tion, yield, and quality of product. 


2 Fisher [10] discusses sequential experimentation in a genetics experiment and Bross
[4] in medical experiments; however, neither of these articles is concerned with the estimation of an
optimal factor combination.

A NOTE ON REGRESSION WHEN THERE IS
EXTRANEOUS INFORMATION ABOUT ONE
OF THE COEFFICIENTS


J. DURBIN 
London School of Economics 


1. INTRODUCTION 


Suppose we have a sample of $n$ observations corresponding to the
regression model

$$ Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \epsilon, $$

where the $n$ values of $\epsilon$ are independent of each other and of
the $x$'s and have zero means and variance $\sigma^2$. In addition to
this sample we are given from outside an unbiased estimate $b_1$ of
$\beta_1$ together with an unbiased estimate $s_1^2$ of $\sigma_1^2$,
its variance. What is the best way of using this information to estimate
$\beta_2$?

Situations of this kind arise in econometric work in combining cross- 
section and time-series data. For instance, in a demand study we may 
wish to estimate the price elasticity of demand from a time series of 
observations using at the same time an estimate of the income elasticity 
obtained from a budget survey. 

This problem was put to me when I was a research worker at the 
Department of Applied Economics, Cambridge, by my colleague, 
M. J. Farrell. Later developments were worked out in co-operation 
with Richard Stone, who kindly supplied the data for the numerical 
example. 

2. SIMPLE METHOD 


The simplest procedure is to accept $b_1$ as the estimate of $\beta_1$ and to
estimate $\beta_2$ by considering the regression of $Y - b_1X_1$ on $X_2$. Denoting
by $y$, $x_1$, $x_2$ the deviations of $Y$, $X_1$, $X_2$ from their sample means, the
estimate of $\beta_2$ is

$$b_2 = \frac{\sum (y - b_1x_1)x_2}{\sum x_2^2} = \frac{\sum x_2y - b_1\sum x_1x_2}{\sum x_2^2}. \tag{1}$$

For given $b_1$,

$$E(b_2 \mid b_1) = \beta_2 - (b_1 - \beta_1)\frac{\sum x_1x_2}{\sum x_2^2}.$$

Thus $b_2$ is conditionally biased. However $E(b_1) = \beta_1$, so that as $b_1$
varies $E(b_2) = \beta_2$, i.e. $b_2$ is unbiased. For fixed x's the variance of $b_2$ is

$$V(b_2) = \frac{E\left\{\sum x_2\left[(y - \beta_1x_1 - \beta_2x_2) - (b_1 - \beta_1)x_1\right]\right\}^2}{\left(\sum x_2^2\right)^2}
= \frac{\sigma^2}{\sum x_2^2} + \sigma_1^2\frac{\left(\sum x_1x_2\right)^2}{\left(\sum x_2^2\right)^2}
= \frac{\sigma^2}{\sum x_2^2} + \sigma_1^2 b_{12}^2, \tag{2}$$

where $b_{12}$ is the regression coefficient of $x_1$ on $x_2$.

We can compare this with the variance that would have been obtained
if the extraneous information had not been used. In that case
the coefficients would have been estimated by least squares, the variance
of the estimate $b_2'$ of $\beta_2$ being

$$V(b_2') = \frac{\sigma^2}{\sum x_2^2(1 - r^2)},$$

where $r$ is the observed correlation between $x_1$ and $x_2$. Now

$$\frac{\sigma^2}{\sum x_2^2(1 - r^2)} = \frac{\sigma^2}{\sum x_2^2} + \frac{\sigma^2}{\sum x_1^2(1 - r^2)}\,b_{12}^2.$$

Thus the diminution in variance due to the use of the extraneous
information is

$$V(b_2') - V(b_2) = b_{12}^2\left[\frac{\sigma^2}{\sum x_1^2(1 - r^2)} - \sigma_1^2\right],$$

which is always positive if

$$\sigma_1^2 < \frac{\sigma^2}{\sum x_1^2(1 - r^2)},$$

i.e. if the variance of the extraneous estimate of $\beta_1$ is less than the
variance of the internal least-squares estimate of $\beta_1$. When $r = 0$, there is no
improvement, as is otherwise obvious since the estimate of $\beta_2$ is then
unaffected by the extraneous information about $\beta_1$.

To estimate $V(b_2)$ we need an estimate of $\sigma^2$. This can, of course, be
calculated from the internal least-squares analysis in the usual way,
and this will generally give the most convenient estimator to use in
practice. A slightly more efficient estimate which takes into account
the information contributed by the external estimate of $\beta_1$ can, however,
be obtained as follows:

$$\sum (y - b_1x_1 - \beta_2x_2)^2 = \sum \{(y - b_1x_1 - b_2x_2) + (b_2 - \beta_2)x_2\}^2
= \sum (y - b_1x_1 - b_2x_2)^2 + (b_2 - \beta_2)^2\sum x_2^2.$$

Taking the expectation of the left-hand side we have

$$E\sum (y - b_1x_1 - \beta_2x_2)^2 = E\sum \{(y - \beta_1x_1 - \beta_2x_2) - (b_1 - \beta_1)x_1\}^2
= (n - 1)\sigma^2 + \sigma_1^2\sum x_1^2.$$

(The factor n-1 occurs instead of n since the observations are measured
from the sample means.) Also

$$E(b_2 - \beta_2)^2\sum x_2^2 = \sigma^2 + \sigma_1^2\frac{\left(\sum x_1x_2\right)^2}{\sum x_2^2} \quad \text{from (2)}.$$

Thus,

$$(n - 2)\sigma^2 = E\sum (y - b_1x_1 - b_2x_2)^2 - (1 - r^2)\sigma_1^2\sum x_1^2.$$

Consequently an unbiased estimate of $\sigma^2$ is given by

$$s^2 = \frac{1}{n - 2}\left[\sum (y - b_1x_1 - b_2x_2)^2 - (1 - r^2)s_1^2\sum x_1^2\right].$$

The first term in the bracket may be evaluated by means of the identity

$$\sum (y - b_1x_1 - b_2x_2)^2 = \sum (y - b_1x_1)^2 - b_2\sum x_2(y - b_1x_1)
= \sum y^2 - 2b_1\sum x_1y + b_1^2\sum x_1^2 - b_2^2\sum x_2^2.$$

Substituting $s^2$ and $s_1^2$ for $\sigma^2$ and $\sigma_1^2$ in (2), we get the unbiased estimate
of $V(b_2)$,

$$\hat{V}(b_2) = \frac{1}{(n - 2)\sum x_2^2}\left[\sum (y - b_1x_1 - b_2x_2)^2 + s_1^2\left\{\frac{(n - 1)\left(\sum x_1x_2\right)^2}{\sum x_2^2} - \sum x_1^2\right\}\right]. \tag{3}$$

Unfortunately this is not distributed as a multiple of $\chi^2$ in the normal
case, and cannot therefore be used to construct an exact t test of $b_2$.
For sufficiently large samples an approximate test may be obtained
by regarding $(b_2 - \beta_2)/\sqrt{\hat{V}(b_2)}$ as a normal variable with zero mean
and unit variance. Alternatively, a better approximation could be
constructed by the method proposed by Welch [3].
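The simple method above reduces to a few lines of arithmetic on the sums of squares and products. The following Python sketch is an illustration only: the function name and argument names are assumptions, not the paper's notation, and the inputs used in the call are the figures quoted in the paper's numerical example (Section 4), with n = 18 taken from the divisor 16 = n - 2 that appears there.

```python
def simple_method(n, syy, s11, s22, s12, s1y, s2y, b1, s1_sq):
    """Simple method for two regressors (hypothetical helper, not from the paper).

    syy, s11, s22, s12, s1y, s2y are sums of squares and products of
    deviations from the sample means; b1 and s1_sq are the extraneous
    estimate of beta_1 and its estimated variance.
    """
    # Equation (1): regress y - b1*x1 on x2.
    b2 = (s2y - b1 * s12) / s22
    # Residual sum of squares via the identity given in the text.
    rss = syy - 2 * b1 * s1y + b1 ** 2 * s11 - b2 ** 2 * s22
    # Squared sample correlation between x1 and x2.
    r_sq = s12 ** 2 / (s11 * s22)
    # Unbiased estimate of sigma^2 that uses the external information.
    s_sq = (rss - (1 - r_sq) * s1_sq * s11) / (n - 2)
    # Equation (2) with s_sq and s1_sq substituted, i.e. equation (3).
    var_b2 = s_sq / s22 + s1_sq * (s12 / s22) ** 2
    return b2, s_sq, var_b2


b2, s_sq, var_b2 = simple_method(
    n=18, syy=0.0352895, s11=0.0086619, s22=0.0410816,
    s12=0.0070014, s1y=-0.0023041, s2y=-0.0235779,
    b1=0.575798, s1_sq=0.0558764,
)
print(round(b2, 6), round(var_b2, 6))
```

With these inputs the sketch should agree with the values of $b_2$ and $\hat{V}(b_2)$ worked out by hand in the numerical example, up to rounding in the last printed digit.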
Similar results are found for the general regression

$$Y = \alpha + \beta_1X_1 + \beta_2X_2 + \cdots + \beta_kX_k + \epsilon$$

where we have an extraneous estimate $b_1$ of $\beta_1$. Let $\boldsymbol{\beta}_2$ denote the vector
$\{\beta_2, \dots, \beta_k\}$, $y$, $x_1$ the vectors of deviations from sample means
of $Y$, $X_1$, and let $X_2$ denote the matrix of deviations from the sample
means of $X_2, \dots, X_k$. Then the estimate of $\boldsymbol{\beta}_2$ obtained by
straightforward substitution of $b_1$ for $\beta_1$ is

$$\mathbf{b}_2 = (X_2'X_2)^{-1}X_2'(y - b_1x_1).$$

The variance matrix of this set of estimates is

$$V(\mathbf{b}_2) = \sigma^2(X_2'X_2)^{-1} + \sigma_1^2(X_2'X_2)^{-1}X_2'x_1x_1'X_2(X_2'X_2)^{-1}
= \sigma^2(X_2'X_2)^{-1} + \sigma_1^2\,\mathbf{b}_{12}\mathbf{b}_{12}',$$

where $\mathbf{b}_{12}$ is the vector of sample regression coefficients of $x_1$ on $x_2, \dots, x_k$.
An unbiased estimator of $\sigma^2$ is given by

$$s^2 = \frac{1}{n - k}\left[\sum (y - b_1x_1 - \cdots - b_kx_k)^2 - (1 - R_1^2)s_1^2\sum x_1^2\right],$$

where $R_1$ is the multiple correlation between $x_1$ and $x_2, \dots, x_k$.
A slightly more efficient estimator of the multiple correlation of $y$ on
$x_1$ and $x_2, \dots, x_k$ than that given by least squares is $R$ defined by

$$1 - R^2 = \frac{(n - k + 1)s^2}{\sum y^2}.$$

3. EFFICIENT METHOD


The above procedure, though simple and direct, does not make the
most efficient use of the available information, since no attempt is
made to improve the estimate of $\beta_1$ by means of the multiple regression
data. The question of efficiency can be explored by considering the
following general problem.

Suppose that $\mathbf{b}_1$ is a vector of unbiased estimators of a set of
parameters $\{\beta_1, \dots, \beta_h\}$ and that $\mathbf{b}_2$ is an independent vector of
unbiased estimators of the extended set $\boldsymbol{\beta} = \{\beta_1, \dots, \beta_k\}$ where $k \ge h$.
The variance matrices $V(\mathbf{b}_1) = V_1$ and $V(\mathbf{b}_2) = V_2$ are assumed to be
known and of ranks $h$ and $k$ respectively. We seek the best unbiased
linear estimators of $\beta_1, \dots, \beta_k$, i.e. those which are linear in the
elements of $\mathbf{b}_1$ and $\mathbf{b}_2$ and whose variances are not greater than the
variances of any other unbiased linear estimators.
Now

$$E\begin{bmatrix}\mathbf{b}_1\\ \mathbf{b}_2\end{bmatrix} = \begin{bmatrix}I_h & 0\\ I_h & 0\\ 0 & I_{k-h}\end{bmatrix}\boldsymbol{\beta},$$

where $I_h$, $I_{k-h}$ are unit matrices of orders $h$ and $k-h$, and 0 represents
any matrix, all of whose elements are zero. Also

$$V\begin{bmatrix}\mathbf{b}_1\\ \mathbf{b}_2\end{bmatrix} = \begin{bmatrix}V_1 & 0\\ 0 & V_2\end{bmatrix}.$$

We now apply Aitken's [2] extension of Gauss's least-squares theorem.
This states that if $E(x) = P\alpha$ and $V(x) = V$, $P$ being a known matrix,
then provided that $V^{-1}$ and $(P'V^{-1}P)^{-1}$ exist, the vector of best
unbiased linear estimators of the elements of $\alpha$ is $(P'V^{-1}P)^{-1}P'V^{-1}x$.

In the present problem,

$$V^{-1} = \begin{bmatrix}V_1^{-1} & 0\\ 0 & V_2^{-1}\end{bmatrix} = \begin{bmatrix}W_1 & 0 & 0\\ 0 & W_{11} & W_{12}\\ 0 & W_{21} & W_{22}\end{bmatrix},$$

say, where $W_1 = V_1^{-1}$, $W_2 = V_2^{-1}$, and $W_{11}$ is the matrix formed by the
first $h$ rows and columns of $W_2$; $W_{12}$, $W_{21}$ and $W_{22}$ are defined similarly. Also

$$P = \begin{bmatrix}I_h & 0\\ I_h & 0\\ 0 & I_{k-h}\end{bmatrix}.$$

Thus

$$P'V^{-1} = \begin{bmatrix}W_1 & W_{11} & W_{12}\\ 0 & W_{21} & W_{22}\end{bmatrix},$$

so that

$$P'V^{-1}P = \begin{bmatrix}W_1 + W_{11} & W_{12}\\ W_{21} & W_{22}\end{bmatrix} = W_1^* + W_2,$$

where $W_1^*$ is the $k \times k$ matrix in which $W_1$ occupies the leading position,
the remaining elements being zeros. Similarly

$$P'V^{-1}x = W_1^*\mathbf{b}_1^* + W_2\mathbf{b}_2,$$

where $\mathbf{b}_1^*$ is the $k \times 1$ vector in which the first $h$ elements are the
elements of $\mathbf{b}_1$, the remainder being zeros.

Hence, applying Aitken's result, the best unbiased linear estimators
of the elements of $\boldsymbol{\beta}$ are given by

$$\mathbf{b} = [W_1^* + W_2]^{-1}[W_1^*\mathbf{b}_1^* + W_2\mathbf{b}_2].$$

The inverse matrix $[W_1^* + W_2]^{-1}$ exists since $W_1^*$ and $W_2$ are positive
semi-definite and positive definite respectively, and hence $W_1^* + W_2$ is positive
definite. The normal equations therefore take the form

$$[W_1^* + W_2]\mathbf{b} = W_1^*\mathbf{b}_1^* + W_2\mathbf{b}_2. \tag{4}$$

The variance matrix is

$$V(\mathbf{b}) = [W_1^* + W_2]^{-1}. \tag{5}$$

These results are of particular interest since they illustrate the
straightforward fashion in which Gauss's theorem may be generalized to deal
with the estimation of parametric vectors rather than scalars.
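Equations (4) and (5) can be tried out on a small case. The sketch below is a minimal illustration in plain Python, assuming small dense matrices stored as lists of lists; the helper names `solve`, `inverse` and `combine`, and the numbers in the demonstration, are hypothetical and are not taken from the paper.

```python
def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [r] for row, r in zip(a, rhs)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def inverse(a):
    """Invert a small square matrix column by column."""
    n = len(a)
    cols = [solve(a, [1.0 if i == j else 0.0 for i in range(n)])
            for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def combine(b1, v1, b2, v2):
    """Return (b, V(b)) from equations (4) and (5).

    b1 estimates the first h parameters, b2 all k of them (k >= h).
    """
    h, k = len(b1), len(b2)
    w1, w2 = inverse(v1), inverse(v2)
    lhs = [row[:] for row in w2]                       # W2
    rhs = [sum(w2[i][j] * b2[j] for j in range(k)) for i in range(k)]
    for i in range(h):                                 # add W1* and W1* b1*
        for j in range(h):
            lhs[i][j] += w1[i][j]
        rhs[i] += sum(w1[i][j] * b1[j] for j in range(h))
    return solve(lhs, rhs), inverse(lhs)


# Hypothetical inputs: one external estimate, two regression estimates.
b, v = combine([1.0], [[1.0]], [3.0, 5.0], [[1.0, 0.0], [0.0, 2.0]])
print(b, v)
```

In this made-up case the two estimates of the first parameter have equal variance and are uncorrelated with the second, so the combined value is their simple average and the second parameter is left as it was.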

In the application to the problem considered above, $\mathbf{b}_1$ consists of
the single element $b_1$ with variance $\sigma_1^2$. Let $\mathbf{b}_2$ denote the vector of
least-squares estimates of the $\beta$'s obtained from the regression observations
by ignoring the extraneous information, i.e. $\mathbf{b}_2 = (X'X)^{-1}X'y$, where $X$
stands for the matrix of deviations from the sample means of $X_1, \dots, X_k$. Then

$$V(\mathbf{b}_2) = \sigma^2(X'X)^{-1}$$

so that

$$W_2 = \frac{1}{\sigma^2}X'X$$

and

$$W_2\mathbf{b}_2 = \frac{1}{\sigma^2}X'y.$$

Thus the normal equations (4) take the form

$$\left[\frac{1}{\sigma_1^2}\begin{pmatrix}1 & 0 & \cdots & 0\\ 0 & 0 & \cdots & 0\\ \vdots & & & \vdots\\ 0 & 0 & \cdots & 0\end{pmatrix} + \frac{1}{\sigma^2}X'X\right]\hat{\boldsymbol{\beta}} = \frac{1}{\sigma_1^2}\begin{pmatrix}b_1\\ 0\\ \vdots\\ 0\end{pmatrix} + \frac{1}{\sigma^2}X'y,$$

where for convenience we write $\hat{\boldsymbol{\beta}}$ for $\mathbf{b}$, i.e.

$$\begin{aligned}
\hat\beta_1\left(\sum x_1^2 + \lambda\right) + \hat\beta_2\sum x_1x_2 + \cdots + \hat\beta_k\sum x_1x_k &= \sum x_1y + \lambda b_1\\
\hat\beta_1\sum x_1x_2 + \hat\beta_2\sum x_2^2 + \cdots + \hat\beta_k\sum x_2x_k &= \sum x_2y\\
\cdots\qquad &\\
\hat\beta_1\sum x_1x_k + \hat\beta_2\sum x_2x_k + \cdots + \hat\beta_k\sum x_k^2 &= \sum x_ky,
\end{aligned} \tag{6}$$

where

$$\lambda = \frac{\sigma^2}{\sigma_1^2}.$$

The variance matrix is, from (5),

$$\sigma^2\begin{bmatrix}\sum x_1^2 + \lambda & \sum x_1x_2 & \cdots & \sum x_1x_k\\ \sum x_1x_2 & \sum x_2^2 & \cdots & \sum x_2x_k\\ \vdots & & & \vdots\\ \sum x_1x_k & \sum x_2x_k & \cdots & \sum x_k^2\end{bmatrix}^{-1}.$$

The only difference from the ordinary least-squares expressions is the
addition of $\lambda$ to $\sum x_1^2$. Thus if $\lambda$ were known it would be no more
difficult to perform the efficient analysis than the least-squares analysis.
The difficulty is, of course, that in practice $\lambda$ will not be known. It may,
however, be estimated by beginning with a least-squares analysis of
the regression data to get an estimate of $\sigma^2$. Confidence limits can be
put on $\lambda$ by the ordinary variance-ratio technique. The increments $\delta b_i$
to be added to the least-squares estimates may then be found from the
equations

$$\begin{aligned}
\left(\sum x_1^2 + \lambda\right)\delta b_1 + \delta b_2\sum x_1x_2 + \cdots + \delta b_k\sum x_1x_k &= \lambda(b_1 - b_1')\\
\delta b_1\sum x_1x_2 + \delta b_2\sum x_2^2 + \cdots + \delta b_k\sum x_2x_k &= 0\\
\cdots\qquad &\\
\delta b_1\sum x_1x_k + \delta b_2\sum x_2x_k + \cdots + \delta b_k\sum x_k^2 &= 0,
\end{aligned} \tag{7}$$

where $b_i'$ is the least-squares estimate of $\beta_i$, as can be seen from (6).
The calculation of the increments for the upper and lower confidence
limits of $\lambda$ will give an idea of the sensitivity of the estimates to
variations in $\lambda$.

It is worth noting that the efficient estimate of $\beta_1$ may be calculated
directly without solving the equations (6). This is done by taking the
weighted mean of $b_1$ and the least-squares value $b_1'$, the weights being
the reciprocals of the respective variances. Thus

$$\hat\beta_1 = \frac{b_1/\sigma_1^2 + b_1'/\sigma_1'^2}{1/\sigma_1^2 + 1/\sigma_1'^2}, \tag{8}$$

where $\sigma_1'^2 = V(b_1')$ and is the top left-hand element of the variance
matrix $\sigma^2(X'X)^{-1}$. This value will be found to satisfy the equations
(6). In practice $\sigma^2$ is unknown, but an unbiased estimate of it can be
calculated in the usual way from the least-squares analysis.

In any particular case we must therefore decide whether to calculate
$\hat\beta_1, \dots, \hat\beta_k$ simultaneously as the solution of (6) (or, equivalently,
of (7)) or alternatively whether to calculate $\hat\beta_1$ from (8), the remaining
coefficients being obtained from the last $k-1$ equations of (6). If $X'X$
has been inverted during the least-squares analysis, so that $\sigma_1'^2$ or its
estimate is known, it will obviously be easier to calculate $\hat\beta_1$ directly
from (8). If, on the other hand, the least-squares normal equations have
been solved without inverting $X'X$ it will be easier to solve (6) or (7)
rather than invert $X'X$ first.
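For two regressors the choice between solving (6) and using the weighted mean (8) can be checked numerically. The Python sketch below is illustrative only; the function names are assumptions, and the inputs are the sums of squares and products, the external estimate and the residual variance quoted in the paper's numerical example.

```python
def efficient_two_regressors(s11, s22, s12, s1y, s2y, b1, lam):
    """Solve equations (6) for k = 2 by Cramer's rule (hypothetical helper)."""
    a11 = s11 + lam                 # lambda is added to sum(x1^2) only
    det = a11 * s22 - s12 ** 2
    rhs1 = s1y + lam * b1
    beta1 = (rhs1 * s22 - s12 * s2y) / det
    beta2 = (a11 * s2y - s12 * rhs1) / det
    return beta1, beta2


def weighted_mean(b1, var_b1, b1_ls, var_b1_ls):
    """Equation (8): weights are reciprocals of the two variances."""
    w_ext, w_ls = 1.0 / var_b1, 1.0 / var_b1_ls
    return (w_ext * b1 + w_ls * b1_ls) / (w_ext + w_ls)


# Figures quoted in the paper's numerical example.
s11, s22, s12 = 0.0086619, 0.0410816, 0.0070014
s1y, s2y = -0.0023041, -0.0235779
b1, var_b1 = 0.575798, 0.0558764
sigma_sq = 0.00142427               # residual variance from least squares
lam = sigma_sq / var_b1

beta1, beta2 = efficient_two_regressors(s11, s22, s12, s1y, s2y, b1, lam)

# Least-squares estimate of beta_1 and its variance, for use in (8).
det_ls = s11 * s22 - s12 ** 2
b1_ls = (s1y * s22 - s12 * s2y) / det_ls
var_b1_ls = sigma_sq * s22 / det_ls
beta1_direct = weighted_mean(b1, var_b1, b1_ls, var_b1_ls)

print(round(beta1, 6), round(beta1_direct, 6))
```

The two routes to $\hat\beta_1$ agree because eliminating $\hat\beta_2$ from (6) leaves exactly the weighted mean in (8); $\hat\beta_2$ then follows from the second equation of (6).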


4. NUMERICAL EXAMPLE 


The foregoing results will now be illustrated by means of data on the
consumption of pork in the United Kingdom 1920-38. We wish to fit
a regression of the form

$$Y = \alpha + \beta_1X_1 + \beta_2X_2 + \epsilon$$

where $Y$ is log consumption of pork per head,
$X_1$ is log income per head,
$X_2$ is log relative price of pork,
$\beta_1$ is the income elasticity, and
$\beta_2$ is the price elasticity of demand for pork.

The analysis of a set of family budget data¹ yields the value $b_1 = 0.575798$
as the estimate of $\beta_1$ with an estimated variance of
0.0558764. From the annual figures of aggregate consumption, income,
etc. we obtain the values of sums of squares and products of deviations
from the means,

$$\begin{aligned}
\sum y^2 &= 0.0352895 & \sum x_1y &= -0.0023041\\
\sum x_1^2 &= 0.0086619 & \sum x_2y &= -0.0235779\\
\sum x_2^2 &= 0.0410816 & \sum x_1x_2 &= 0.0070014.
\end{aligned}$$

¹ For further information about the data see the article by Stone [2]. One point that may be
mentioned here is that the time-series observations were transformed by taking first differences before
calculating the sums of squares and products, in order to reduce the effect of serial correlation.

A least-squares analysis of the time-series data gives the estimates of
$\beta_1$ and $\beta_2$

$$b_1' = 0.229520 \quad\text{and}\quad b_2' = -0.613043,$$

with estimated variances,

$$\hat{V}(b_1') = 0.190701 \quad\text{and}\quad \hat{V}(b_2') = 0.040208.$$

For the simple method described above we accept the value $b_1 = 0.575798$
as the estimate of $\beta_1$ and take as our estimate of $\beta_2$ the value
given by (1), i.e.

$$b_2 = \frac{-0.0235779 - (0.575798)(0.0070014)}{0.0410816} = -0.672060.$$

For the estimate of variance of $b_2$ we need

$$\sum (y - b_1x_1 - b_2x_2)^2 = 0.0352895 + (0.575798)\{-2(-0.0023041) + (0.575798)(0.0086619)\} - (0.4516646)(0.0410816) = 0.0222595.$$

Substituting in (3) we have for the unbiased estimate of the variance
of $b_2$,

$$\hat{V}(b_2) = \frac{1}{16(0.0410816)}\left[0.0222595 + (0.0558764)\left\{\frac{17(0.0070014)^2}{0.0410816} - 0.0086619\right\}\right] = 0.0348527.$$

This may be compared with the figure of 0.0362922 obtained by
substituting the least-squares estimator of $\sigma^2$ together with $s_1^2$ for $\sigma_1^2$ in (2).
The apparent reduction in the variance is of course due simply to the
use of a more efficient estimate of $\sigma^2$; the actual variance is unaffected.

To calculate the efficient estimates of $\beta_1$ and $\beta_2$ we need first an
estimate of $\lambda$. The estimated residual variance of the time-series data is
0.00142427, whence

$$\hat\lambda = \frac{0.00142427}{0.0558764} = 0.0254897.$$

Substituting in (6) we have the equations

$$\begin{aligned}
(0.0086619 + 0.0254897)\hat\beta_1 + 0.0070014\hat\beta_2 &= -0.0023041 + (0.0254897)(0.575798),\\
0.0070014\hat\beta_1 + 0.0410816\hat\beta_2 &= -0.0235779,
\end{aligned}$$

which yield the estimates

$$\hat\beta_1 = 0.497328 \quad\text{and}\quad \hat\beta_2 = -0.658087.$$

The estimated variance matrix is

$$0.00142427\begin{bmatrix}0.0341516 & 0.0070014\\ 0.0070014 & 0.0410816\end{bmatrix}^{-1} = \begin{bmatrix}0.0432143 & -0.0073649\\ -0.0073649 & 0.035924\end{bmatrix}.$$

Thus the estimated variances of the estimates of $\beta_2$ given by least
squares, the simple method and the efficient method respectively are

0.040208, 0.034853 and 0.035924.

These values illustrate the gains achieved by the methods developed
above, in comparison with the least-squares method. As it happens,
the estimated variance given by the simple method is smaller than that
given by the efficient method; this is presumably due to sampling
fluctuations.
A HOLLERITH TECHNIQUE FOR THE SOLUTION
OF NORMAL EQUATIONS


M. J. R. HEALY AND G. V. DYKE
Rothamsted Experimental Station


IN THE critical study of the results of large-scale sample surveys it is
frequently necessary to consider data classified in several different
ways, and to attempt to disentangle the effects of the various classifications,
which will usually not be orthogonal to one another. One way of
doing this is to fit constants by least squares, assuming the effects
implied by the classifications to be additive; for a discussion of the
method, see Yates [6, p. 137 et seq.]. The process of fitting involves the
solution of the normal equations, a set of simultaneous equations equal in
number to the total number of categories in all the classifications. If
this number is at all large, the computations become very lengthy,
and it is desirable to use Hollerith machinery, more especially as the
main computations of the survey will often be done on Hollerith
machines, and the data will already be punched on cards.

At least two methods of solving simultaneous equations with the aid
of Hollerith machines have been published [3, 4]. Both employ a
technique of pivotal condensation, and demand the use of a range of
machines outside the scope of a small installation. In the present
context the large number of equations may lead to a serious accumulation
of rounding errors, and there are advantages in the alternative
technique of successive approximation, as described, for example, by
Stevens [5]. The present paper gives a method of mechanising this
technique, using only the basic Hollerith machines, the sorter and
tabulator. For producing the working pack, a reproducer is desirable though
not essential.


THE METHOD OF SOLUTION 


The method of solution will be illustrated on a small scale by means
of an example with three classifications used by Stevens [5]. The
computations will be set out in some detail, and the process of
mechanisation can then be briefly explained.

The necessary data, abstracted from the complete table given by
Stevens, are shown in Table 1. Here we have 2-way tables showing the
number of units in the various sub-classes, and the total number of
units and total "yield" for each category of the three classifications.
From this table, the normal equations (Table 2) can be written down
immediately; the diagonal terms come from the marginal totals, the
other coefficients from the body of the table, while the right-hand sides
of the equations are the total yields. The basic problem is the solution
of these equations. Notice that they are not independent; in fact what
we determine are the differences between diets, between litters and
between sexes.

TABLE 1

                  Litter              Sex
              1    2    3    4      1     2     Totals   Yield
Diet   1      3    2    2    2      6     3        9       572
       2      3    3    2    1      6     3        9       748
       3      2    3    2    2      5     4        9       733
       4      2    3    2    2      6     3        9       815
Sex    1      7    6    6    4                    23      1982
       2      3    5    2    3                    13       886
Totals       10   11    8    7     23    13       36
Yield       734  819  713  602   1982   886               2868

TABLE 2

$$\begin{aligned}
9d_1 + 3l_1 + 2l_2 + 2l_3 + 2l_4 + 6s_1 + 3s_2 &= 572\\
9d_2 + 3l_1 + 3l_2 + 2l_3 + l_4 + 6s_1 + 3s_2 &= 748\\
9d_3 + 2l_1 + 3l_2 + 2l_3 + 2l_4 + 5s_1 + 4s_2 &= 733\\
9d_4 + 2l_1 + 3l_2 + 2l_3 + 2l_4 + 6s_1 + 3s_2 &= 815\\
3d_1 + 3d_2 + 2d_3 + 2d_4 + 10l_1 + 7s_1 + 3s_2 &= 734\\
2d_1 + 3d_2 + 3d_3 + 3d_4 + 11l_2 + 6s_1 + 5s_2 &= 819\\
2d_1 + 2d_2 + 2d_3 + 2d_4 + 8l_3 + 6s_1 + 2s_2 &= 713\\
2d_1 + d_2 + 2d_3 + 2d_4 + 7l_4 + 4s_1 + 3s_2 &= 602\\
6d_1 + 6d_2 + 5d_3 + 6d_4 + 7l_1 + 6l_2 + 6l_3 + 4l_4 + 23s_1 &= 1982\\
3d_1 + 3d_2 + 4d_3 + 3d_4 + 3l_1 + 5l_2 + 2l_3 + 3l_4 + 13s_2 &= 886
\end{aligned}$$

The first step is to divide each equation by its diagonal term,
obtaining the coefficients set out in Table 3. These equations with rounded-off
coefficients are those which we actually solve, and this table eventually
serves as a punching schedule for the working pack.

TABLE 3

  d1    d2    d3    d4     l1    l2    l3    l4     s1    s2       y
   1                      .333  .222  .222  .222   .667  .333    63.56
         1                .333  .333  .222  .111   .667  .333    83.11
               1          .222  .333  .222  .222   .556  .444    81.44
                     1    .222  .333  .222  .222   .667  .333    90.56
 .300  .300  .200  .200     1                      .700  .300    73.40
 .182  .273  .273  .273           1                .545  .455    74.45
 .250  .250  .250  .250                 1          .750  .250    89.12
 .286  .143  .286  .286                       1    .571  .429    86.00
 .261  .261  .217  .261   .304  .261  .261  .174     1           86.17
 .231  .231  .308  .231   .231  .385  .154  .231           1     68.15

As first approximations we take the straight means of each category, which
appear as the right-hand sides of the equations in Table 3. However, as only
differences are to be estimated, we can add or subtract any convenient
quantity from each set of values, and in practice we subtract the
smallest value in each set from the others. Thus, subtracting 63.56 from the
d's, 73.40 from the l's and 68.15 from the s's and retaining 3 significant
figures, we arrive at the first column of Table 4. Inserting these
approximations into the first four equations, we obtain improved values
for the d's, as follows:

(A)
$$\begin{aligned}
d_1 &= 63.56 - (.333 \times 0.0 + .222 \times 1.0 + \cdots + .333 \times 0.0) = 45.0494\\
d_2 &= 83.11 - (.333 \times 0.0 + .333 \times 1.0 + \cdots + .333 \times 0.0) = 65.8870\\
d_3 &= 81.44 - (.222 \times 0.0 + .333 \times 1.0 + \cdots + .444 \times 0.0) = 64.8164\\
d_4 &= 90.56 - (.222 \times 0.0 + .333 \times 1.0 + \cdots + .333 \times 0.0) = 71.9384
\end{aligned}$$

We subtract 45.0494 from each of these new approximations, round off
to one decimal and use them in the next set of equations to get
improved values of the l's:

(B)
$$\begin{aligned}
l_1 &= 73.40 - (.300 \times 0.0 + .300 \times 20.8 + \cdots + .300 \times 0.0) = 45.2200\\
l_2 &= 74.45 - (.182 \times 0.0 + .273 \times 20.8 + \cdots + .455 \times 0.0) = 46.2125\\
l_3 &= 89.12 - (.250 \times 0.0 + .250 \times 20.8 + \cdots + .250 \times 0.0) = 58.7450\\
l_4 &= 86.00 - (.286 \times 0.0 + .143 \times 20.8 + \cdots + .429 \times 0.0) = 59.3914
\end{aligned}$$

Subtracting 45.2200 and rounding off, we go on to the next two equations:

$$\begin{aligned}
s_1 &= 86.17 - (.261 \times 0.0 + .261 \times 20.8 + \cdots + .174 \times 14.2) = 63.1684\\
s_2 &= 68.15 - (.231 \times 0.0 + .231 \times 20.8 + \cdots + .231 \times 14.2) = 45.2887
\end{aligned}$$

We thus arrive at the set of second approximations in Table 4.
Repeating the whole cycle we obtain 3rd approximations, and these are found
to be unaltered by further cycles.

Solutions correct to three figures have now been obtained and in view
of the rounding off of the coefficients no further accuracy can normally
be achieved by this technique without modification; three figures will
in any case be sufficient in sample survey work. It is convenient to
make final adjustments to make the mean of each group equal to the
general mean, and these figures are given in Table 4 together with the
solution obtained by Stevens.

TABLE 4

             Approximations           Final      Stevens'
         1st     2nd     3rd        solution     solution
d1       0.0     0.0     0.0          62.7        62.665
d2      19.6    20.8    21.0          83.7        83.078
d3      17.9    19.8    19.8          82.5        82.426
d4      27.0    26.9    26.9          89.6        89.555
l1       0.0     0.0     0.0          72.4        72.399
l2       1.0     1.0     1.0          73.4        73.391
l3      15.7    13.5    13.5          85.9        85.949
l4      12.6    14.2    14.2          86.6        86.594
s1      18.0    18.0    17.9          88.5        88.505
s2       0.0     0.0     0.0          70.6        70.662
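The cycle of successive approximation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' procedure in every detail: the function and variable names are hypothetical, the coefficients are rebuilt from the sub-class counts of Table 1 and rounded to three decimals as in Table 3, and each group is re-based on its smallest value and rounded to one decimal after it is updated.

```python
# Sub-class counts from Table 1.
DIET_LITTER = [[3, 2, 2, 2], [3, 3, 2, 1], [2, 3, 2, 2], [2, 3, 2, 2]]
DIET_SEX = [[6, 3], [6, 3], [5, 4], [6, 3]]
SEX_LITTER = [[7, 6, 6, 4], [3, 5, 2, 3]]
DIET_N, LITTER_N, SEX_N = [9, 9, 9, 9], [10, 11, 8, 7], [23, 13]
# Right-hand sides of Table 3 (category means).
DIET_MEAN = [63.56, 83.11, 81.44, 90.56]
LITTER_MEAN = [73.40, 74.45, 89.12, 86.00]
SEX_MEAN = [86.17, 68.15]


def coef(count, total):
    """Coefficient of Table 3: count over diagonal term, to 3 decimals."""
    return round(count / total, 3)


def rebase(values):
    """Subtract the smallest value and round to one decimal."""
    low = min(values)
    return [round(v - low, 1) for v in values]


def cycle(d, l, s):
    """One full cycle: update the d's, then the l's, then the s's."""
    d = rebase([DIET_MEAN[i]
                - sum(coef(DIET_LITTER[i][j], DIET_N[i]) * l[j] for j in range(4))
                - sum(coef(DIET_SEX[i][k], DIET_N[i]) * s[k] for k in range(2))
                for i in range(4)])
    l = rebase([LITTER_MEAN[j]
                - sum(coef(DIET_LITTER[i][j], LITTER_N[j]) * d[i] for i in range(4))
                - sum(coef(SEX_LITTER[k][j], LITTER_N[j]) * s[k] for k in range(2))
                for j in range(4)])
    s = rebase([SEX_MEAN[k]
                - sum(coef(DIET_SEX[i][k], SEX_N[k]) * d[i] for i in range(4))
                - sum(coef(SEX_LITTER[k][j], SEX_N[k]) * l[j] for j in range(4))
                for k in range(2)])
    return d, l, s


# First approximations (first column of Table 4).
first = ([0.0, 19.6, 17.9, 27.0], [0.0, 1.0, 15.7, 12.6], [18.0, 0.0])
second = cycle(*first)
print(second)
```

Each group is updated from the most recent values of the other two groups, which is why the d's are ready before the l's are recomputed; further calls to `cycle` continue the iteration.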


THE HOLLERITH TECHNIQUE 


It is apparent from the cycle of iteration set out in full above that the
basic operation is the computing of sums of products, $\sum xy$ say, where
the x's stay fixed throughout the problem (they are in fact the
coefficients in Table 3). It is natural therefore to punch these quantities on
cards. The actual multiplications are done by successive addition, as
on a desk calculator. Thus if 123 is to be multiplied by 456, we pass
through the tabulator 4 cards punched 12300, 5 cards punched 1230
and 6 cards punched 123.
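Multiplication by successive addition is easy to mimic in code. The short Python sketch below is only an illustration of the idea; the function name and the returned list of (number of cards, value punched) pairs are assumptions introduced for the example.

```python
def multiply_by_cards(multiplicand, multiplier):
    """Multiply by adding shifted copies of the multiplicand.

    For each digit of the multiplier, a card carrying the multiplicand
    shifted to the matching decimal place is added that many times.
    """
    digits = [int(ch) for ch in str(multiplier)]
    total = 0
    cards_passed = []
    for position, digit in enumerate(digits):
        shift = len(digits) - 1 - position
        card_value = multiplicand * 10 ** shift   # 12300, 1230, 123
        for _ in range(digit):
            total += card_value                   # one card through the tabulator
        cards_passed.append((digit, card_value))
    return total, cards_passed


total, cards = multiply_by_cards(123, 456)
print(total, cards)
```

For the example in the text this passes 4 + 5 + 6 = 15 cards and accumulates 56088, the product of 123 and 456.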

The construction of the working pack will now be described in detail.
One set of 27 cards as described in the previous paragraph is used for
each column of Table 3, that is, for each variable, the units in the
diagonal being omitted. Leaving columns 1-5 for indicative material,
the basic pack will be punched as follows (a dash denoting a blank column):

[Punching layout by column number for cards 1a, 2a, 3a, etc.; the layout is not recoverable from the source.]

Each of these a cards is copied a further eight times. Nine more cards
are then punched for each variable, with the information in cols. 6-78
transferred to cols. 7-79; these will be referred to as cards 1b, 2b, etc.
A set of c cards is similarly punched with the information appearing in
cols. 8-80.

The indicative matter in cols. 1-5 is used for sorting, controlling and
checking. All a cards are punched X in col. 4, b cards are punched 1
in this column, and c cards are punched 1 in col. 5. By leading these
columns to a counter and using the "29 feature"¹ a check can be made
that the right multipliers are used at each stage. Column 3 is not
needed in the present example, but in almost all practical cases the
number of equations will be such that two or more cards will be needed
for each variable; these can be distinguished by punching in this
column.
To form the multipliers, the correct number of each type of card
are picked out by hand from the pockets of the sorter. Using always
3-figure multipliers, the 12 pockets of the machine will hold 4 variables
at a time, so that all cards 1-4 are punched 0 in col. 1, all cards 5-8
are punched 1, and so on. Control is made possible by over-punching
XY, X, Y or nothing to distinguish the variables in each set of four,
these punches being ignored by the sorter. The punching in col. 2 is
designed to bring the cards into the sorter pockets in the proper order,
thus

Punch in col. 2:  [twelve punch positions; illegible in the source]
Cards:            1a 1b 1c 2a 2b 2c 3a 3b 3c 4a 4b 4c
                  5a 5b 5c 6a 6b etc.

To start the solution, the first approximations are calculated (Table
4, column 1). The cards are sorted on col. 1 and all cards 1-4 removed,
since they are not needed in approximating to the d's. Cards 5-8 are
sorted on col. 2 and picked by hand to give the correct multipliers.
Reference to the equations marked A above shows that the numbers
of cards required are

Card:     5a 5b 5c   6a 6b 6c   7a 7b 7c   8a 8b 8c
Number:    0  0  0    0  1  0    1  5  7    1  2  6

these numbers are simply the first approximations to the l's. The
remaining cards are sorted on col. 2 and hand-picked in their turn.

The cards picked out are now tabulated. Cols. 4-5, 6-10, 11-15,
16-20, 21-25 are plugged to the counters and by controlling on col. 1,
cols. 4-5 are totalled at the end of each variable to check that the
hand-picking has been done correctly. The other counters total at the
end of the run, and the printed record shows

 10
157
126
180    185106    172230    166236    186216

which on subtraction from the right-hand sides gives the second
approximations found at A.

¹ This feature allows the punching of numbers from 0-29 in one column, the "tens" and "twenties"
being overpunched X and Y respectively.

The multipliers for the d-variable cards are now known so these are
sorted and hand-picked. The l-variable cards are not needed at the
next stage and are removed from the pack. The tabulator counters are
replugged to cols. 4-5, 26-30, 31-35, 36-40, 41-45 and the following
tabulation gives

208
198
269
180    281800    282375    303750    266086

which leads to the approximations found at B. The rest of the solution
continues on the lines detailed in the previous section.

A slight modification is possible which reduces the effect of rounding
off the coefficients of the original equations. The coefficients are
punched in the a pack to 5 decimals, and are reduced to 4 and 3
decimals for the b and c packs. If these packs are produced mechanically
on a reproducer, the reduction is made without rounding off (compare
[4], pp. 162-3).


THE METHOD IN PRACTICE 


The method has been used in the analysis of two years' results from
a survey of maincrop potatoes [2]. There were 30 and 28 constants
respectively, representing 6 classifications of which the largest contained
11 categories. In each case, 3-figure accuracy was attained after 4 cycles
of the iteration. When some experience had been gained, each cycle 
took about 45 minutes to complete. Preparation of the working pack 
took about 4 hours using a reproducer; this was reduced to little more 
than 1 hour when a summary punch became available, so that the 
equivalent of Table 3 could be punched at the same time as Table 1 
was being produced on the tabulator. The complete solution thus took 
about 5-8 hours working time. The same iteration carried out on desk 
machines took about 4 hours for each cycle. 

Mistakes in hand-picking were rare, but it was found worth while to 
use coloured cards for the a cards representing the first figures of each 
multiplier. 


Some difficulty may be caused by highly correlated variables. If two
such variables are present (that is, if one of the non-diagonal coefficients
is near to 1) the corrections tend to pass backwards and forwards
between them, showing only slow convergence to zero. There is no
particular point in continuing the iteration for the sake of these variables
only, as they will in fact be ill-determined and high precision in the
solution will only be misleading.

It is known that for normal equations the process described above
always converges. Convergence may be slow, however, and Aitken has
described a technique for accelerating it [1]. Its application is made
awkward here by the fact that the corrections are adjusted at each
stage. In the two large examples so far attempted, it has been quicker
to run another cycle or two of the iteration, but in other cases
Aitken's method may be useful. His iteration is not quite the one used
above, but the practical differences are trivial.

We are indebted to Dr. F. Yates for the original suggestion which 
led to the method set out in this paper. 


SUMMARY 


A method is described for fitting constants to survey data by least 
squares, using a Hollerith sorter and tabulator. 
THE USE OF RUNS TO CONTROL THE MEAN IN
QUALITY CONTROL


H. WEILER
University of Technology, Sydney, Australia


For quality control charts controlling the mean of a normal
population, either small samples are taken out at frequent
intervals or large samples at less frequent intervals. It will be
shown that in order to detect small changes of the population
mean, the amount of inspection is greatly reduced by the
selection of large samples. However, if for other reasons small
samples are desirable, a control by runs of sample means
above or below certain control limits makes it possible to use
small samples and yet maintain the advantage of a reduced
amount of inspection. For certain types of runs, the sample
size n = 1 turns out to be very economical, so that time saving
methods of control by gauging may be introduced without
appreciable loss of efficiency.


1. INTRODUCTION 


CONSIDER a variate x representing some measure of a mass-produced
article, and suppose that the production has been brought under
control. It may then go out of control in three different ways:

(a) The mean of the population may change, which could happen, for instance,
when a tool setting gets out of position or when a tool wears out.

(b) The standard deviation of the population may change, which could happen,
for instance, when a fixed tool becomes loose.

(c) The population may cease to be homogeneous, that is, elements may appear
that do not belong to the original population. This may happen, for
instance, when articles are produced by several machines; if one machine
develops a fault, articles from that machine may be out of control while
all other articles remain unaffected.


While for a check on faults of type (a) a control chart controlling
the mean of the population is most suitable, a control chart for
standard deviations or ranges is used to check on faults of type (b). For
faults of type (c) both charts are useful, but it is essential that articles
be selected from rational subgroups [1, 2]. For instance, if the same
type of articles are produced by each of several machines, articles may
be selected from each machine separately in order to allow
discrimination between the various machines.

The usual control chart controlling the mean of a population is
constructed in the following way: After the mean and standard deviation
of the population have been reliably estimated, samples of fixed size n
are selected and their arithmetic means $\bar{x} = \sum x/n$ are calculated. A
chart is then constructed with control limits $m \pm B_1\sigma/\sqrt{n}$, where $m$ and
$\sigma$ are the estimates of the population mean and standard deviation, and
$B_1 = 3$ or 3.09. The various values $\bar{x}$ are entered in the chart in
chronological order, and as soon as one such value falls outside the control
limits, production is stopped to allow investigation.

In this paper, we shall investigate the following alternative control
method. Instead of stopping the production when a single $\bar{x}$ value falls
outside the control limits $m \pm B_1\sigma/\sqrt{n}$, we may calculate a pair of
narrower limits $m \pm B_2\sigma/\sqrt{n}$ and stop production as soon as two successive
$\bar{x}$ values fall above the upper or below the lower of these limits. More
generally, we may calculate a pair of limits $m \pm B_\lambda\sigma/\sqrt{n}$ such that we
may stop production as soon as $\lambda$ successive $\bar{x}$ values fall above the
upper or below the lower of these limits. In each case, $B_\lambda$ is determined
such that if the population mean does not change, an average of 1000
samples is necessary to produce one run of $\lambda$ successive $\bar{x}$ values above
the upper (or below the lower) control limit. Thus, in each case, a false
alarm can be expected about once in every 500 samples tested. On the
other hand, if the population mean does change, the amount of
inspection required to detect a given change will depend on $\lambda$ and $n$.

It has been shown in a previous paper [3] that for $\lambda = 1$ the most
economical sample size (that is, that value of $n$ which would lead to the
detection of a given change of the mean after a minimum of inspection)
is much larger than the sample sizes usually used in quality control.
Nevertheless, since small samples lend themselves readily to the
detection of faults of type (c), the quality control engineer may be
reluctant to abandon them in favor of larger samples. It will be shown
that for a check on faults of type (a) the use of runs makes it possible
to retain small samples without an appreciable loss of efficiency.

With the exception of a paper by Olmstead [4], in which runs are
terminated whenever an observation turns out to be one of a specified
kind, recent publications on runs deal mainly with runs within samples
of fixed size [5, 6]. In particular, the theory has been applied to
problems of quality control in the form of runs above the sample median
[7, 8], and runs up and down [9, 10, 11]. The runs in this paper differ
from those of the other publications in that they are not related to a
fixed number of observations. They constitute a test similar to a
sequential test [12], where the number of observations is not
predetermined but depends on the outcome of the observations themselves.
Although little mathematical research seems to have been done in this
field, the method has been used intuitively by quality control engineers
[13, 14].

2. DETERMINATION OF CONTROL LIMITS FOR RUNS

Definition

Consider a sequence of trials where each trial may or may not produce
an event E. If a particular trial produces the event E, we shall call it
a success. A sequence of $\lambda$ consecutive successes not preceded by a
success is called a run of $\lambda$ successes.
We shall make use of the following theorem [15].

Theorem

If $p$ is the probability that a single random trial results in a success,
then the expected number $s$ of independent trials required to obtain a run
of $\lambda$ successes is $s = (1 - p^\lambda)/[p^\lambda(1 - p)]$.
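The theorem can be checked numerically. The following is only a minimal sketch, not part of the original paper: the function names are hypothetical, and the simulation counts trials until the first block of `lam` consecutive successes, which is how the theorem is read here.

```python
import random

def expected_trials_for_run(p: float, lam: int) -> float:
    """Closed form from the theorem: s = (1 - p^lam) / (p^lam * (1 - p))."""
    return (1.0 - p**lam) / (p**lam * (1.0 - p))

def simulate_trials_for_run(p: float, lam: int, repeats: int = 20000, seed: int = 1) -> float:
    """Monte Carlo estimate of the mean number of trials until lam successes in a row."""
    rng = random.Random(seed)
    total = 0
    for _ in range(repeats):
        streak = 0
        trials = 0
        while streak < lam:
            trials += 1
            # a success extends the current streak, a failure resets it
            streak = streak + 1 if rng.random() < p else 0
        total += trials
    return total / repeats

if __name__ == "__main__":
    print(expected_trials_for_run(0.5, 2))   # closed form
    print(simulate_trials_for_run(0.5, 2))   # simulated mean, should be close
```

For $\lambda = 1$ the closed form reduces to $1/p$, the familiar mean waiting time for a single success.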

Using this theorem, we may solve the following problem.

Problem

Let $x$ be a normal variate of mean $m$ and standard deviation $\sigma$, and
let $\bar{x} = \sum x/n$ be the arithmetic mean of a sample of $n$ independent
$x$ values. Every $\bar{x}$ is called a trial and every
$\bar{x} \ge m + B\sigma/\sqrt{n}$ a success. Determine $B$ such that on the
average, 1000 trials are required to obtain one run of $\lambda$ successes.

Let $p$ be the probability that a random trial $\bar{x}$ gives
$\bar{x} \ge m + B\sigma/\sqrt{n}$. Solving the above problem for $\lambda = 1$,
we have $s = 1/p = 1000$, or $p = 0.001$. Since $\bar{x}$ is normally
distributed with mean $m$ and standard deviation $\sigma/\sqrt{n}$, we obtain
$B = 3.09$ from a set of normal probability tables. Thus, if the $\bar{x}$
values are entered in a control chart with control limits
$m \pm 3.09\sigma/\sqrt{n}$, and if the population mean and standard deviation
remain unchanged, we can expect that an average of 1000 trials will be
required to obtain one trial above the upper control limit. Similarly, an
average of 1000 trials will be required to obtain one trial below the
lower control limit, so that an average of 500 trials can be expected to
pass before a "false alarm" or type I error [5] occurs.

In a similar manner, we may solve the problem for $\lambda = 2$. This gives
$s = (1 + p)/p^2 = y^2 + y = 1000$, where $y = 1/p$. Solving this equation,
we obtain $y = 31.127$ and $p = 0.03213$. The normal probability tables
give $B = 1.85$. Thus, if $\bar{x}$ is entered in a chart with control limits
$m \pm 1.85\sigma/\sqrt{n}$, and if two successive $\bar{x}$ values above the
upper or below the lower limit are regarded as significant, we can again
expect that 500 trials will pass before a false alarm occurs.

For $\lambda = 3$, the equation reduces to $s = y + y^2 + y^3 = 1000$, which
may be solved by any numerical method. Using Newton's method, we find
easily $y = 9.645$ and $p = 0.10368$, and we deduce $B = 1.26$.

In this way, we calculate $B$ for $\lambda = 1, 2, 3, \ldots, 9$, and obtain
the following values.


TABLE I
CONTROL LIMIT FACTORS FOR λ = 1, 2, ..., 9

λ      1       2        3       4       5       6       7       8       9
y      1000    31.127   9.65    5.34    3.742   2.953   2.494   2.199   1.995
p      .0010   .0321    .1037   .1873   .2672   .3386   .4010   .4547   .5013
B      3.09    1.85     1.26    0.89    0.62    0.42    0.25    0.11    0.00


In each case, if we regard a run of $\lambda$ values above the upper or below
the lower control limit as significant, we can expect an average of 500
trials to pass before a type I error is committed.
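The calculation of the control limit factors can be sketched in a few lines. This is an illustrative sketch only: the paper uses Newton's method and printed normal tables, whereas the code below swaps in plain bisection and the standard-library normal quantile, and all function names are hypothetical.

```python
from statistics import NormalDist

def expected_trials(p: float, lam: int) -> float:
    """Expected trials for a run of lam successes (theorem of Section 2)."""
    return (1.0 - p**lam) / (p**lam * (1.0 - p))

def success_probability(lam: int, target: float = 1000.0) -> float:
    """Find p such that expected_trials(p, lam) == target, by bisection.

    expected_trials decreases as p increases, so the root is bracketed.
    """
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if expected_trials(mid, lam) > target:
            lo = mid   # runs are still too rare: p must be larger
        else:
            hi = mid
    return (lo + hi) / 2.0

def control_limit_factor(lam: int, target: float = 1000.0) -> float:
    """B such that Pr{z >= B} = p for a standard normal z."""
    p = success_probability(lam, target)
    return NormalDist().inv_cdf(1.0 - p)

if __name__ == "__main__":
    for lam in range(1, 10):
        p = success_probability(lam)
        print(lam, round(1.0 / p, 3), round(p, 4), round(control_limit_factor(lam), 2))
```

Each printed row gives $\lambda$, $y = 1/p$, $p$ and $B$, in the same order as the rows of Table I.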


3. THE AVERAGE AMOUNT OF INSPECTION FOR A GIVEN CHANGE
OF THE POPULATION MEAN

Let $x$ be a normal variate and suppose that the control limits
$m \pm B\sigma/\sqrt{n}$ are adopted for the arithmetic mean
$\bar{x} = \sum x/n$. If the population mean changes from $\mu = m$ to
$\mu = m + k\sigma$ $(k > 0)$ while the standard deviation $\sigma$ remains
constant, the probability that $\bar{x}$ exceeds the upper control limit is
(see also [8]):

$$
\begin{aligned}
P &= \Pr\{\bar{x} \ge m + B\sigma/\sqrt{n} \mid \mu = m + k\sigma\} \\
  &= \Pr\{(\bar{x} - \mu)\sqrt{n}/\sigma \ge B - k\sqrt{n} \mid \mu = m + k\sigma\} \\
  &= \Pr\{z \ge B - k\sqrt{n}\},
\end{aligned}
\tag{1}
$$

where $z$ is the standardized normal variate (mean zero, standard
deviation one).

If we regard a run of $\lambda$ values above the upper control limit as
significant, we shall on the average require
$s = (1 - P^\lambda)/[P^\lambda(1 - P)]$ samples to detect a change of the mean
from $\mu = m$ to $\mu = m + k\sigma$. The corresponding number of articles to
be tested is

$$A(n) = \frac{n(1 - P^\lambda)}{P^\lambda(1 - P)}, \tag{2}$$

where

$$P = \frac{1}{\sqrt{2\pi}} \int_{B - k\sqrt{n}}^{\infty} \exp(-\tfrac{1}{2}z^2)\,dz. \tag{3}$$

The value of $n$ for which $A(n)$ is a minimum may be found by solving the
equation $dA/dn = 0$. This has been done in [3] for $\lambda = 1$ and can also
be done for $\lambda = 2, 3, \ldots$, but a direct calculation of $A(n)$ for
various values of $n$ is less tedious and more instructive.
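The direct calculation mentioned above might be sketched as follows. This is a hedged illustration rather than the author's own procedure: the function names are hypothetical, and the normal integral (3) is taken from the standard library instead of printed tables.

```python
from math import sqrt
from statistics import NormalDist

def exceed_probability(B: float, k: float, n: int) -> float:
    """Equation (3): P = Pr{z >= B - k*sqrt(n)} for a standard normal z."""
    return 1.0 - NormalDist().cdf(B - k * sqrt(n))

def average_inspection(n: int, k: float, lam: int, B: float) -> float:
    """Equation (2): average number of articles inspected before a run of lam."""
    P = exceed_probability(B, k, n)
    samples = (1.0 - P**lam) / (P**lam * (1.0 - P))
    return n * samples

if __name__ == "__main__":
    # Compare the conventional chart (lam=1, B=3.09) with a run chart (lam=2, B=1.85)
    # for a shift of k = 0.4 standard deviations and several sample sizes.
    for n in (1, 5, 10, 20):
        conventional = average_inspection(n, 0.4, 1, 3.09)
        run_of_two = average_inspection(n, 0.4, 2, 1.85)
        print(n, round(conventional), round(run_of_two))
```

With $k = 0$ and $\lambda = 1$, $B = 3.09$, the function returns roughly $1000\,n$, which is the false-alarm spacing used to fix $B$.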


Fig. 1. Average Amount of Inspection for $\lambda = 1$; $B = 3.09$; $n = 1, 5, 10, 20, 40$.


4. GRAPHICAL REPRESENTATION AND DISCUSSION 


Equations (2) and (3) show that when $n$, $\lambda$, and $B$ are given, the
average amount of inspection $A(n)$ is a function of $k$, which can be
readily calculated with the help of a set of normal tables. The variation
of $A(n)$ as a function of $k$ is shown in Figures 1, 2, 3, 4, for various
values of $n$ and $\lambda$. Since $A(n)$ increases rapidly with decreasing $k$,
the use of semi-logarithmic paper was found to be convenient.

It may be seen from Figure 1 that with the conventional control chart
($\lambda = 1$), small samples usually require a much greater amount of
inspection than large samples. In particular, the sample size $n = 5$ is
economical only when the population mean changes by more than (say)
one standard deviation. The sample size $n = 1$ is particularly
uneconomical unless $k$ is very large.

Figure 2 gives the average amount of inspection required when two
successive $\bar{x}$ values above the upper (or below the lower) control limit
are regarded as significant. It shows that here the amount of inspection
by means of small samples is greatly reduced. In particular, for
$k = 0.4$ and sample size $n = 5$, the average amount of inspection is 380
for the conventional chart and only 210 for the chart with $\lambda = 2$. The
sample size $n = 1$, although still uneconomical, is more economical than
with the conventional chart.

Fig. 2. Average Amount of Inspection for $\lambda = 2$; $B = 1.85$; $n = 1, 5, 10, 20$.

Figure 3 shows that a chart with $\lambda = 3$ represents a further
improvement for small samples, while large samples become uneconomical.

Figure 4 shows that for $\lambda = 6$ the sample size $n = 1$ is very
economical. This is an important result, because the high efficiency of a
control by runs of individual values makes it possible to use gauges where
otherwise costly measurements are required. The loss of efficiency that
testing by gauges usually entails is here avoided.

Fig. 3. Average Amount of Inspection for $\lambda = 3$; $B = 1.26$; $n = 1, 5, 10, 20$.

A similar graph for $\lambda = 9$, $B = 0$ would show that the efficiency of
sample size $n = 1$ is about the same as for a chart with $\lambda = 6$. The
advantage of taking $\lambda = 9$ rather than $\lambda = 6$ would be that $B$ is
equal to zero so that no control limits need to be calculated.


5. THE CHOICE OF THE MOST SUITABLE CONTROL CHART 


Since for any given value of $B$, the probability $P$ defined by equation
(3) is a function of $k\sqrt{n}$ alone, the expression

$$k^2 A(n) = \frac{(k\sqrt{n})^2 (1 - P^\lambda)}{P^\lambda (1 - P)} \tag{4}$$

remains constant as long as $k\sqrt{n}$ and $\lambda$ are kept fixed. For any
given value of $\lambda$, this expression is thus a function of the one
variable $k\sqrt{n}$. It is easy to calculate the values of the function for
any values of $k\sqrt{n}$ and to plot the corresponding curve. This has been
done in Figure 5 for $\lambda = 1, 2, 3, 6, 9$. The curves show clearly that
the conventional chart, based on $\lambda = 1$, is economical only when
$k\sqrt{n}$ is greater than (say) 2.5.

Fig. 4. Average Amount of Inspection for $\lambda = 6$; $B = 0.42$; $n = 1, 5, 10$.

Fig. 5. Average Amount of Inspection for $\lambda = 1, 2, 3, 6, 9$.

This means that the conventional chart is most efficient in a range
that is usually of little interest. If, for instance, the sample size
$n = 4$ is used, the conventional chart is efficient only when the
population mean changes by more than 1.3 standard deviations. A chart with
$\lambda = 2$, on the other hand, would then be very efficient for changes of
between 0.8 and 1.5 standard deviations and is superior to a chart
with $\lambda = 1$ for any change up to 1.4 standard deviations. The saving of
inspection may be anything up to 40%.

The saving is even greater (up to 50%) when $\lambda = 3$ is used, but the
range for which such a chart is most efficient is somewhat reduced.
When $\lambda = 6$ is used, the saving may be anything up to 60%. However,
the range of high efficiency is further reduced and the chart becomes
rather inefficient when $k\sqrt{n} > 2$.
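The comparison between charts can be reproduced approximately from equation (4). The sketch below is an illustration under stated assumptions: the function names and the dictionary of $B$ values are hypothetical conveniences, the $B$ values are those quoted in the text for $\lambda = 1, 2, 3, 6, 9$, and the run-length factor is written in the equivalent form $y + y^2 + \cdots + y^\lambda$ with $y = 1/P$, which avoids dividing by $1 - P$ when $P$ is close to one.

```python
from statistics import NormalDist

# Control limit factors B quoted in the text for each run length lam.
B_BY_LAMBDA = {1: 3.09, 2: 1.85, 3: 1.26, 6: 0.42, 9: 0.00}

def scaled_inspection(u: float, lam: int) -> float:
    """Equation (4): k^2 * A(n) as a function of u = k * sqrt(n) only."""
    P = 1.0 - NormalDist().cdf(B_BY_LAMBDA[lam] - u)
    # (1 - P^lam) / (P^lam * (1 - P)) equals the sum of (1/P)^j for j = 1..lam
    samples = sum(P ** (-j) for j in range(1, lam + 1))
    return u * u * samples

def saving_vs_conventional(u: float, lam: int) -> float:
    """Fractional saving of inspection relative to the conventional chart (lam = 1)."""
    return 1.0 - scaled_inspection(u, lam) / scaled_inspection(u, 1)

if __name__ == "__main__":
    for u in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5):
        row = [round(saving_vs_conventional(u, lam), 2) for lam in (2, 3, 6, 9)]
        print(u, row)
```

A positive entry means the run chart needs less inspection than the conventional chart at that value of $k\sqrt{n}$; a negative entry means it needs more.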

The case $\lambda = 9$ is of special interest, because $B$ is equal to zero.
This means that no control limits need to be drawn. A chart then becomes
unnecessary, and gauging methods may be adopted when the sample
size $n = 1$ is adopted. Moreover, any of the above charts may be combined
with the observation of individual articles. Production should be
stopped to allow investigation as soon as $\lambda$ successive $\bar{x}$ values
fall above the upper or below the lower limit, or when 9 successive single
values fall above or below the population mean.


6. CONCLUSION 


It has been demonstrated that the sequential use of runs for control
charts controlling the mean leads to great saving of inspection, and
that it will, in many cases, be of advantage to introduce it instead of
the conventional chart. The conventional chart is to be preferred only
when large samples are not a disadvantage or when sequential methods
are undesirable.

TRUNCATED POISSON DISTRIBUTIONS


PAUL R. RIDER
Wright-Patterson Air Force Base and Washington University

This paper gives a method of estimating the parameter of
a Poisson distribution which has been truncated at the lower
end. Application is made to a number of actual examples.


INTRODUCTION


MANY studies have been made of truncated distributions. (See [2]
and the references contained therein.) Of the continuous type,
the normal distribution and the Pearson system of distributions have
been rather thoroughly investigated. Of discrete distributions, the
binomial has been studied by Finney [3].

Yule [6] has considered an interesting type of distribution which he
met in studying vocabulary. This is the number of words occurring
once, the number occurring twice, and so on, in a specified work of a
certain author. The distribution is somewhat similar to a truncated
discrete distribution, in that there is no frequency corresponding to
the number of words occurring zero times. Obviously there can be no
frequency corresponding to the zero class unless it can be assumed that
the total number of words in the author's vocabulary is known. The
frequency of the zero class would then be those words in his vocabulary
which were not used in the particular work under consideration.

Other examples of truncated discrete distributions can easily be
thought of. Consider, for example, the distribution of number of traffic
violations. There will be certain persons who have received 1 ticket,
some who have received 2 tickets, some 3, and so on. There will be no
record of those who have received no tickets.

The present paper considers another discrete distribution, the Poisson.

As is well known, the Poisson probability function is the function

$$p_x = e^{-\lambda}\lambda^x / x!. \tag{1}$$

This gives the limit, as the number of trials approaches infinity but
the number of expected occurrences $\lambda$ remains constant, of the
probability that an event will occur exactly $x$ times, $x$ ranging over the
non-negative integers.

The function contains the single parameter $\lambda$, to which, incidentally,
each and every semi-invariant of the Poisson distribution is equal.
Tippett [5] and Bliss [1] have considered the question of estimating this
parameter when the frequencies of those classes corresponding to values
of $x$ above a certain specified value have been pooled. Fisher and
Yates [4], p. 1, have shown that for an even number of degrees of
freedom, the probability of exceeding a given value of $\chi^2$ is reducible
to a partial sum of a Poisson series, i.e., a Poisson series with the
upper end truncated. The present paper gives methods of estimating
$\lambda$ when some of the data in a sample are missing, particularly when the
lower end is truncated.


ESTIMATING THE PARAMETER FROM TWO CLASS FREQUENCIES 


If a sample is truly Poisson in character, the value of $\lambda$ can be
estimated even when only two different class frequencies are known. Let
us designate by $f_x$ the frequency with which the value $x$ occurs in the
sample. Then the expected value of $f_x$ is $Np_x$, where $N$ is the number
in the sample. If we use the observed frequencies of two different
classes as estimates of their expected values, we are led to the equation

$$\frac{f_k}{f_m} = \frac{m!}{k!}\,\lambda^{k-m}, \tag{2}$$

which is easily solved for $\lambda$.
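Solving the two-class equation for $\lambda$ amounts to taking a root of a ratio of frequencies. A minimal sketch, assuming the ratio form $f_k/f_m = \lambda^{k-m}\,m!/k!$ that follows from the Poisson function (1); the function name is hypothetical.

```python
from math import exp, factorial

def lambda_from_two_classes(k: int, f_k: float, m: int, f_m: float) -> float:
    """Estimate lambda from the frequencies of two distinct classes x = k and x = m."""
    if k == m:
        raise ValueError("the two classes must be different")
    if f_k <= 0 or f_m <= 0:
        raise ValueError("both class frequencies must be positive")
    ratio = (f_k * factorial(k)) / (f_m * factorial(m))   # equals lambda^(k - m)
    return ratio ** (1.0 / (k - m))

if __name__ == "__main__":
    # Expected frequencies of classes 1 and 3 in a Poisson sample of 100 with lambda = 2.
    lam = 2.0
    f1 = 100 * exp(-lam) * lam**1 / factorial(1)
    f3 = 100 * exp(-lam) * lam**3 / factorial(3)
    print(lambda_from_two_classes(3, f3, 1, f1))
```

With observed rather than expected frequencies the result depends strongly on which two classes are chosen, since sampling error in either frequency passes straight into the estimate.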


ESTIMATING THE PARAMETER FROM A TRUNCATED SAMPLE 
We wish now to consider the case in which one or more classes at the
lower end of the sample are missing. We shall use the following notation:

$$T_0 = \sum_{k}^{\infty} f_x, \qquad T_1 = \sum_{k}^{\infty} x f_x, \qquad T_2 = \sum_{k}^{\infty} x^2 f_x, \tag{3}$$

where $k$ is the number of missing classes. Further, let

$$
\begin{aligned}
T_0' &= N\sum_{0}^{k-1} p_x + T_0, \\
T_1' &= N\sum_{0}^{k-1} x p_x + T_1, \\
T_2' &= N\sum_{0}^{k-1} x^2 p_x + T_2.
\end{aligned}
\tag{4}
$$

Then $T_1'/T_0'$ is an estimate of the mean $\lambda$, and similarly, $T_2'/T_0'$
is an estimate of the second moment of the distribution about the origin,
viz., $\lambda + \lambda^2$. We therefore set

$$T_1' = \lambda T_0', \qquad T_2' = (\lambda + \lambda^2)T_0'. \tag{5}$$

Substituting from (4) and (1), we are led, after some reduction, to the
following equations:

$$T_1 - \lambda T_0 = \frac{N e^{-\lambda}\lambda^{k}}{(k-1)!}, \tag{6}$$

$$T_2 - (\lambda + \lambda^2)T_0 = \frac{N e^{-\lambda}\lambda^{k}}{(k-1)!}\,(k + \lambda). \tag{7}$$

Solving these simultaneous equations for $\lambda$, we get

$$\lambda = \frac{T_2 - kT_1}{T_1 - (k-1)T_0}. \tag{8}$$

When $\lambda$ has been estimated from (8), all missing $f_x$ can be estimated,
as can the total frequency.
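Formula (8) needs only the three sums $T_0$, $T_1$, $T_2$ over the observed classes. The following sketch is illustrative: the function names are hypothetical, the total frequency is recovered from the relation $T_1 - \lambda T_0 = N e^{-\lambda}\lambda^{k}/(k-1)!$ given in (6), and the data are the $\lambda = 0.5$ sample of Table 2 with the lowest classes treated as missing.

```python
from math import exp, factorial

def truncated_sums(freqs: dict, k: int):
    """T0, T1, T2 of notation (3): sums over the observed classes x >= k."""
    T0 = sum(f for x, f in freqs.items() if x >= k)
    T1 = sum(x * f for x, f in freqs.items() if x >= k)
    T2 = sum(x * x * f for x, f in freqs.items() if x >= k)
    return T0, T1, T2

def truncated_estimate(freqs: dict, k: int) -> float:
    """Equation (8): lambda = (T2 - k*T1) / (T1 - (k - 1)*T0)."""
    T0, T1, T2 = truncated_sums(freqs, k)
    return (T2 - k * T1) / (T1 - (k - 1) * T0)

def estimated_total(freqs: dict, k: int) -> float:
    """Total frequency N implied by equation (6) once lambda is estimated."""
    lam = truncated_estimate(freqs, k)
    T0, T1, _ = truncated_sums(freqs, k)
    return (T1 - lam * T0) * factorial(k - 1) / (exp(-lam) * lam**k)

if __name__ == "__main__":
    sample = {0: 54, 1: 32, 2: 12, 3: 2}      # lambda = 0.5 column of Table 2
    print(truncated_estimate(sample, 1))      # class 0 treated as missing
    print(truncated_estimate(sample, 2))      # classes 0 and 1 treated as missing
    print(estimated_total(sample, 1))         # estimated size of the full sample
```

Classes below $k$ are ignored by the sums, so the same dictionary can be reused for different truncation points.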


EMPIRICAL SAMPLING 


As a test of how good an estimate of $\lambda$ is provided by (8), samples
of size 100 were drawn, by using random numbers, from populations of
10,000, conforming as closely as possible to Poisson distributions. The
following values of $\lambda$ were used: 0.5, 1, 2, 3, 4, 5. These samples are
shown in Table 2. The first column gives the values of $x$. The second
column, headed $\lambda = 0.5$, gives the frequencies, for the respective values
of $x$, in the sample drawn from the Poisson population in which the
parameter $\lambda$ has the value 0.5; the column headed $\lambda = 1$ gives the
frequencies in the sample drawn from the population in which the
parameter has the value 1, and so on.

We shall denote by $\lambda'$ the estimate of $\lambda$ obtained from (8) with
$k = 1$, and by $\lambda''$ the estimate of $\lambda$ obtained from (8) with $k = 2$.
Values of $\lambda'$ and $\lambda''$ are recorded in Table 2.


COMPARISON WITH MAXIMUM LIKELIHOOD ESTIMATES 


It can be shown that the maximum likelihood estimate of $\lambda$ is given
by the solution of the equation

$$\nu_1^{(k)} = \frac{\sum_{x=k}^{\infty} x\,p_x}{\sum_{x=k}^{\infty} p_x}, \tag{9}$$

where $\nu_1^{(k)}$ is the first moment of the truncated sample. In particular,
we have

$$\nu_1' = \frac{\lambda}{1 - e^{-\lambda}}, \qquad \nu_1'' = \frac{\lambda(1 - e^{-\lambda})}{1 - e^{-\lambda} - \lambda e^{-\lambda}}. \tag{10}$$
TABLE 1

  λ     ν₁'     ν₁''   |   λ     ν₁'     ν₁''   |   λ     ν₁'     ν₁''

 0.1   1.051   2.034   |  0.9   1.517   2.347   |  1.7   2.080   2.742
 0.2   1.103   2.069   |  1.0   1.582   2.392   |  1.8   2.156   2.797
 0.3   1.157   2.105   |  1.1   1.649   2.438   |  1.9   2.234   2.854
 0.4   1.213   2.142   |  1.2   1.717   2.486   |  2.0   2.313   2.911
 0.5   1.271   2.181   |  1.3   1.787   2.534   |  2.5   2.724   3.220
 0.6   1.330   2.221   |  1.4   1.858   2.584   |  3.0   3.157   3.560
 0.7   1.391   2.262   |  1.5   1.931   2.635   |  4.0   4.075   4.323
 0.8   1.453   2.304   |  1.6   2.005   2.688   |  5.0   5.034   5.176

TABLE 2


SAMPLES FROM POISSON DISTRIBUTIONS


  x     λ=0.5    λ=1     λ=2     λ=3     λ=4     λ=5

  0      54      47      20       6       4       0
  1      32      26      31       7       9       7
  2      12      14      25      32      21       5
  3       2       9      10      14      20      10
  4               4      10      19      17      28
  5                       3      15      10      11
  6                       1       4       8      16
  7                               2       8      14
  8                               1       2       6
  9                                       1       2
 10                                               1

 λ'     0.58    1.34    1.86    3.02    3.70    4.68
 λ''    0.38    1.34    1.95    2.93    3.78    4.65
 ν₁'    1.35    1.83    2.15    3.30    3.73    4.84
 ν₁''   2.14    2.63    2.88    3.48    4.01    5.13
 λ̂'    0.63    1.36    1.79    2.71    3.62    4.80
 λ̂''   0.40    1.49    1.94    2.89    3.59    4.94

To assist in solving these equations, values were assigned to $\lambda$, and
the corresponding values of $\nu_1'$ and $\nu_1''$ were calculated. Results are
shown in Table 1. From this table, for a given value of $\nu_1'$ or $\nu_1''$, the
maximum likelihood estimate, $\hat{\lambda}'$ or $\hat{\lambda}''$, of the parameter
$\lambda$ can be obtained by interpolation. The values for the samples obtained
in this study are exhibited in Table 2.
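Instead of interpolating in a table, the maximum likelihood equation can be solved numerically. The sketch below is an assumption-laden illustration, not the paper's procedure: it equates the mean of the truncated sample to the mean of a Poisson variate conditioned on $x \ge k$ and solves by bisection; the function names, the series length and the search bracket are all hypothetical choices.

```python
from math import exp

def truncated_poisson_mean(lam: float, k: int, terms: int = 200) -> float:
    """Mean of a Poisson(lam) variate conditioned on x >= k (series cut at `terms`)."""
    p = exp(-lam)            # p_0
    num = den = 0.0
    for x in range(terms):
        if x >= k:
            num += x * p
            den += p
        p *= lam / (x + 1)   # recurrence p_{x+1} = p_x * lam / (x + 1)
    return num / den

def mle_lambda(sample_mean: float, k: int, lo: float = 1e-6, hi: float = 50.0) -> float:
    """Solve truncated_poisson_mean(lam, k) = sample_mean by bisection.

    The truncated mean increases with lam, and sample_mean must exceed k.
    """
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if truncated_poisson_mean(mid, k) < sample_mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

if __name__ == "__main__":
    # lambda = 0.5 sample of Table 2 with class 0 missing: frequencies 32, 12, 2.
    mean_k1 = (1 * 32 + 2 * 12 + 3 * 2) / (32 + 12 + 2)
    print(round(mean_k1, 2), round(mle_lambda(mean_k1, 1), 2))
```

For $k = 1$ and $k = 2$ the function `truncated_poisson_mean` reproduces the entries of Table 1, for example the values tabulated at $\lambda = 1.0$.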


CONCLUSION 


As judged by the limited number of samples in this study, the estimates
of $\lambda$ provided by the suggested method seem in most cases to be
somewhat better than those provided by the method of maximum
likelihood. This is particularly true when only the lowest class is missing.
Moreover, the method is quite simple and direct, while the method of
maximum likelihood requires the solution of equation (9) either by
trial and error or by the use of tables similar to Table 1.

ADDENDUM


Attention should be called to a paper by F. N. David and N. L. 
Johnson, “The Truncated Poisson,” Biometrics, 8 (1952), 275-85, 
which appeared after my paper was submitted for publication. The 
authors consider the special case k=1 of the estimator which I have 
proposed. They show that it has an efficiency less than 1. The efficiency
has a minimum value of about 70%, which occurs in the range $\lambda = 2.5$
to $\lambda = 20$, and approaches 100% with increasing $\lambda$.

PERCENTAGE POINTS OF THE INCOMPLETE
BETA FUNCTION


ROBERT E. CLARK
The Pennsylvania State College

THE table presented here gives to four significant figures the values
of $p = P(N, X, \alpha)$ defined by

$$\alpha = \sum_{r=X}^{N} \binom{N}{r} p^r (1 - p)^{N-r} = I_p(X, N - X + 1),$$

for $N = 10(1)50$, $X = 1(1)N$, and $\alpha = .005, .010, .025$, and $.050$, where
$I_p(X, Y)$ is Karl Pearson's incomplete beta function ratio.¹ Values of
$p$ for which $I_p(X, Y) = .005, .010, .025, .05, .10, .25, .50$ have been given
by Thompson² in terms of the arguments $\nu_1 = 2Y$ and $\nu_2 = 2X$. The
entries in the present table were obtained by inverse linear interpolation
of the logarithms of the accumulated frequencies found in Pearson's
Tables of the Incomplete Beta Function and by interpolation with
Lagrangian coefficients of Thompson's percentage points of the
incomplete beta function. For values of $p > .2000$ these two methods of
interpolation gave results which agreed within two units in the fourth
significant figure, in spite of the fact that for $\nu_2 > 30$ in Thompson's
tables double interpolation was employed. Thompson's tables for
$p < .2000$ were worked twice to insure accuracy, and then were accepted
as accurate. For values of $p > .2000$ the data were smoothed by taking
fourth differences, staying within the limits set by these two methods of
interpolation. The data are therefore felt to be accurate within $\pm 1.5$
in the last significant figure.

Since binomial sums and the incomplete Beta function appear frequently
in statistics the table may be used in a number of problems:

1. Confidence limits for binomial variates may be obtained directly from the
   table.
2. The table gives some percentage points for the incomplete Beta function
   which are not given by Thompson.²
3. The values in the table are the lower percentage points of all order
   statistics in samples of size from 10 to 50 inclusive. From these values it
   is possible to obtain the corresponding percentage points of any continuous
   distribution by the method given by Curtiss.³
4. The .05 column of the table is an extension of a part of the table given by
   Grubbs.⁴
5. Percentage points of the F distribution for $\alpha = .005, .010, .025, .050$;
   $n_1 = 2(2)100$, $n_2 = 2(2)100$ for $n_1 + n_2 - 2 \le 100$ may be obtained from
   the present table by making the transformation⁵

   $$F = \frac{n_2(1 - p)}{n_1 p},$$

   where $p = P[\tfrac{1}{2}(n_1 + n_2 - 2), \tfrac{1}{2}n_2, \alpha]$. For example,
   $F_{.05}$ for $n_1 = 10$, $n_2 = 20$ is 2.35 since $p = P(14, 10, .05) = .46$

These are only a few of the applications which may be made of the
table which is presented here. The reader will probably know of others.

¹ Tables of the Incomplete Beta Function, edited by Karl Pearson, Biometrika Office, University
College, London, W.C. 1.
² Catherine M. Thompson: "Percentage points of the incomplete beta function," Biometrika, 32
(1941) 168-81.
³ J. H. Curtiss: "Convergent sequences of probability distributions," American Mathematical
Monthly, 50 (1943) 103-5.

TABLE 1 


PERCENTAGE POINTS OF THE INCOMPLETE BETA FUNCTION 
(times 10,000) 


 N    X     .005     .010     .025     .050         N    X     .005     .010     .025     .050

10 1 501 1005 25.20 51.16 7 2085 2349 2767 202 
2 1085 1554 252.0 307.7 8 205 3024 3489 300 
8 301 475.1 667.4. 872.6 9 3з 378 4081 427 
4 707 932.1 1210 1500 10 4270 407 509 5619 
5 08 1504 1871 224 11 5090 505 652 6з 
9. 190 218 2004 3035 12 мй бз зи 7791 

7 9 291 3476 394 
8 208 3888 449 401 13 1 385 7.728 19.40 3938 
9 4557 4056 5550 60% 2 82,52 118.2 102.1 — 280.5 
9 — 85897 6310 005 тп 3 083 357.8 503.8 000.5 
: 4 508 694.6 909.2 127 
пт 4560 9.13 22,9 46.52 5 942.3 1108 1880 1657 
2 9820 14.7 228.3 333.2 6 1983 1588 1922 2239 
8 594 428.2 602.2 788.2 7 1887 2129 2513 2870 
4 684 89.6 1008 1351 8 2454 27% 3158 388 
5 145 — 1344 105 1996 9 3087 3301 387 4274 
6 103 190 238 27 10 3794 4122 49 — 6054 
| 7 22 202 8079 — 3408 П 4590 409 55 580% 
8 3:07 3300 390 43% 12 550 5872 6397 6897 
9 3015 — 4277 — 4822 5209 13 6653 707 7530 7М2 

10 404 5800 5872 6356 
11 6% 609 701 7616 14 — 1 3.50 7.170 1807 36.57 
2 164 109.5 178.0 200.0 
В 1 416 8872 2108 42.65 3 2171 330.6 4658 611.0 
2 89.68 128.5 208.0 304.6 4 525.9 640.3 838.9 101 
8 390.4 38.8 586 718.7 5 860 109 1270 1527 
4 624.0 759,0 99.5 1229 6 1207 157 1766 201 
5 104 1215 187 180 7 1724 1947 2304 2636 
в 1522 — 1740 2100 2453 8 2234 2488 2886 3250 

=> 


⁴ Frank E. Grubbs, "On designing single sampling inspection plans," Annals of Mathematical
Statistics, 20 (1949), 249-56.
⁵ Maxine Merrington and Catherine M. Thompson: "Tables of percentage points of the inverted
beta (F) distribution," Biometrika, 33 (1943), 73-88; and C. J. Burke: "Computation of the levels of
significance in the F test," Psychological Bulletin, 48 (1951) 392-97.


TABLE 1 (cont.)


 N    X     .005     .010     .025     .050         N    X     .005     .010     .025     .050
9 279 300 3514 зон 18 6% би 7131 — 7499 
10 зл 3726 4190 — 490 17 7322 707 — 807 8и 
il 4108 4438 400 533 
12 487] 527 519 6М6 158 1 2.18 5.582 1406 28.46 
13 5780 609 биз 7053 3 58.90 84.57 1375 201.1 
14 6849 — 7197 764 8074 3 1970 253.6 357.0 470.2 

4 402 488.0 640.0 796.9 

15 1  8.M1 6.08 16.80 3414 5 — 6094.4 771.9 9805 1104 
2 T7LI7 10.0 165,8 242.3 6 9507 1006 194 1508 
3 238.0 307.2 433.1 508.5 7 шю ма 1700 1089 
4 487.0 593.9 778.7 90.6 8 1650 184 25 — 240 
5 801.1 943.6 1182 117 9 206 2269 202 2012 
6 109 148 и 1909 10 244 210 306 8406 
7 1587. 1795 207 2487 11 9 3186 355 990 
8 3001 2987 2650 300 12 — 3420 3691 4000 4400 
0 256] 2829 3229 3500 13 з 4008 4652 5022 
10 — 318 мз 3838 4% 14 498 4901 5% Мп 
поз 4031 4490 4 15 506 5417 5658 63 
2 495 475 501 5002 18-77 5982 76008 600) 6807 
1 5137 из Би 60% Пп 0587 6840 — 771 — 7623 
и 5984 бл 6805 7206 18 — 74509 743% 8147 867 
ш м м с тар YU a joes аа ви и 

2 55.81 80.01 1301 19.3 

16 1 3.132 6.280 15.81 32.01 3 18.2 209.6 3383 447 
2 00.58 9.4 1551 2268 4 3907 4606 6052 752.9 
3 223.1 287.0 447 5315 5 668 727.8 04.7 1000 
4 454.5 553.8 72.8 902.5 6 805.0 102 1258 1475 
5 745,4 884 — 1100 1321 т 127 1868 1629 — 1875 
6 186 12501 1520 178 з 1599 102 2005 2007 
7 ип 105 105 2267 9 — 199 214 245 02789 
8 1897 2117 2405 2786 10 3310 250 2886 3201 
9 2362 207 2988 3334 п 240 2980 3350 3681 
10 2808 3134, 3543 290 19 301 — 3447 386 481 
и 3415 3701 4134 4517 13 3671 3940 4345 4700 
12 4009 40 4762 5156 и 4182 4462 4880 5242 
13 4656 ӨП 5435 5834 5 479 5018 5444 5809 
M 5020 56 6165 6502 10 5318 5018 6040 6400 
15 6180 6512 6077 17360 17 юз 6250 0080 7042 
16 7181 / 7499 7941 8293 18 6080 6082 7397 7736 

19 7567 7848 8095 8541 

и 1 298 5.910 14.88 30.13 
2 62.56 80.67 145.8 213.2 20 1 2,506 5.024 1265 25.01 
3 209.2 269.2 379.9 499.0 2 52.05 7540 123.5 180.7 
4 495.0 518.8 681.1 846.4 3 176.4 227.1 320.7 4017 
5 670 821.7 101 1238 4 357.6 4382 573.3 713.5 
6 шм 1168 1421 — 104 5 582 688.4 86.7 1041 
7 оо 15520 184 219 6 845.5 954 1189 135 
8 опи оп 2x8 9901 7 109 02 150 173 
9 219 202 2781 3108 8 1460 104 1912 217 
10 266 206 308 360 9 1906 2001 23% E 
11 3154 3423 3833 4197 10 2177 2390 272 3 
12 300 905 404 4781 и 2570 2801. 31520 3469 
13 42958 4566 5010 53% 12 200) 3224 3005 306 
14 486 5204 567 — 604 з ии 3600 бз 4420 
15 5587 591 6950 6758 м — 394 ап 42 402 
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-0%5 .010 .025 .050 N x -005 .010 .025 .050 


7673 7943 8316 8609 10 1848 2031 2319 2582 
11 2176 2374 2682 2961 
2.387 4.785 12.05 24.40 12 2521 2783 3059 3352 
50.37 72.22 117.5 171.9 13 2884 3108 3450 3754 
10.7 215.9 304.9 401.0 ч 3264 3499 3854 4169 
339.5 414.2 546 681 15 3602 3900 4274 4596 
553.3 6532 821.8 988.5 16 4079 4331 4708 5036 
801.2 924.7 1128 1324 17 4517 4776 5160 5490 
1078 1224 1459 1082 18 4978 5242 5630 5961 
1381 1546 1811 2057 19 5466 5733 6123 6451 
1707 1891 2182 2450 20 5988 6256 6641 6904 
2055 2257 2511 2858 2 6554 6819 7196 7507 
2425 2642 2978 3281 22 7186 7443 7805 8098 
2815 — 3047 3402 8719 23 7942 8185 8518 8779 
3228 3472 3844 4172 
3663 3819 4303 4641 24 2.088 4.187 10.54 21.35 


48.03 068.88 112.0 10.0 п 2070 2260 2555 284 
159.7 205.7 290.6 382.2 12 2390 250 2912 3194 
323.1 304.3 — 518.7 ' 646.0 13 2738 29538 292 3576 
526.2 621.4 782.1 941.1 м 3096 33220 3604 3968 
761.3 878.9 1078 1200 15 3470 3705 — 4059 — 4371 
1004 16 1380 1599 16 3860 4104 4468 4786 
1910 — 1468 1720 19% 17 4268 — 4519 4891 53 
1018 — 1700 2071 2327 18 496 — 4952 — 5329 5053 
1946 — 2138 — 2439 2713 19 5145 505 575 6109 
2293 201 2822 313 20 5621 5882 6202 6582 
2000 2881 3221 3526 21 6127 0388 674 7078 
3048 3280 — 3636 3952 22 6676 6934 — 7300 7602 
3451 — 3000 4066 4301 23 7287 . 7539 7887 8171 
3877 4132 4513 4846 24 8019 8254 8575 8827 
4326 4588 4078 5315 
4799 5068 5463 5802 25 1 2.005 4.019 10.12 20.50 
5801, 574 — 5972 6309 2 42.16 6046 98.39 14.0 
5839 — 6113 6500 6841 3 139.9 180.2 254.7 335.2 
6423 665 704 — 7405 4 282.8 344.7 453.8 565.6 
7076 — 7342 7716 8019 5 489 542.2 6083.1 822.9 
780 811 — 8450 8727 6 662.5 765.5 935.0 1101 
7 88.9 100 1207 1395 
2.79 4.300 11.00 22.98 8 1135 1273 1495 1703 
45.9 65.82 107.1 156.7 9 1309 153 1797 2024 
162.5 164 277.5 365.2 10 109 1848 2113 235% 


в 
E 
> 
е 
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INCOMPLETE BETA FUNCTIONS 
TABLE 1—(cont.) 


835 


26 


27 


X .% 20 205 (9 N X 059 00 005 .050 
20 2284 мз 270 301 5 3002 3213 3593 3816 
зп 97 2814 311 ми 16 зи 350 3880 4171 
14 6 3103 20% 3786 17 8672 308 437 45% 
15 3% 205 3867 4168 18 405 457 404 4005 
16 3005 3900 4252 450 19 4% 40 4082 5286 
17 4048 9 4651 — 4904 20 4740 506 532 5677 
18 4447 4694 5062 5378 21 5173 5417 5774 6079 
19 48и 5116 5487 5905 22 5589 585 60 — 6494 
20 58020 5557 550 — 6246 23 8027 6273 6627 09024 
21 5765 6021 6392 — 6704 21 бз 6736 704 733 
2 езт 612 608 7183 5 6% 72:4 7571 7М7 
28 бю 7041 797 700 20 7554 7783 803 830 
24 1802 7605 7905 8230 27 808 8432 873 8050 

25 800 8318 808 — 8871 
28 — 1 170 3.580 9.038 18.90 
1 1.98 3.865 9.733 19.71 2 3757 53.88 8770 1284 
2 405 5810 94.55 138.4 8 144 160.3 220.7 298.5 
з 143 173.0 246 3220 4 250.7 308.2 434 508.1 
4 21.0 330.8 435.0 543.1 5 400.8 480.9 606.4 731.1 
5 401 501 655.5 789.8 6 596.5 678.0 829.6 976.9 
6 — 65.1 733.9 897.4 106 7 785.6 80.5 1060 127 
7 851.6 908 — 1157 1338 8 102 1025 1322 1500 
8 07 120 мз 103 9 1992 1870 1588 170 
9 1333 мат 171 190 10 ит 1627 184 2082 
10 1605 1708 2003 2257 1 13 196 250 2383 
n 1887 2000 2335 254 12 20 2176 240 2091 
12 2181 2860 2059 201 13 2 246 — 2751 3007 
13 2489 2% 2993 32% и 252 60 3065 3881 
14 2800 3018 3337 261 15 — 28/4 3078 3337 2602 
15 зиз 3301 302 394 16 3186 3% 3718 4000 
16 3499 3716 4057 4557 i7 350 279 408 4346 
17 — 3850 404 — 4433 419 18 385 400 407 470 
18 4% 4465 4821 510 19 40 40 46% 5062 
19 405 680 52 552 20 450 4780 5134 5433 
20 50% 5271 5% — 5946 21 405 502 5513 5813 
21 5450 50 605 — 6374 22 5313 5554 5005 6203 
22 5%0 6151 6513 6818 23 м0 мй 601 6406 
23 679 606 0085 721 24 бт 6087 6734 703 
24 6% та 7487 ТП 25 бй 608 707 749 
25 701 тот 80386 — 8301 26 7089 701 740 708 
20 8156 87 8677 8012 27 7631 7854 8165 815 
28 8276 8483 8760 8085 

1 1.856 3.722 9.373 18.98 
2 38.0 55.90 91.00 133.2 29 1 1128 3.465 87% 17.67 
3 129.2 166.4 235.3 300.8 2 36.25 52.00 8404 123.9 
4 260.4 318.0 418.0 522.3 3 12.0 154.6 218.6 288.0 
5 628 497 630.0 150.3 4 241.7 2952 389.0 485.2 
6 09.8 704.9 802.2 1015 5 390.0 463.4 5840 704.9 
т 817.3 993 11 1285 6 540 6532 700.44 01.6 
8 102 170 1375 1568 7 1756.4 804 100 1102 
9 1283 106 162 1802 8 99.9 1083 123 145 
10 158 165 1М0 2166 9 185 118 189 5 
n 1807 106 2999 2479 10 140 155 174 2005 
12 2088 2268 2548 2801 1 1606 1823 2069 2293 
13 9381 202 207 311 12 1923 2001 292 299 
14 26 2887 3195 940 13 2191 2369 265 2893 
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N x 005 010 .025 .050 N x 005 .010 .025 -050 


9 
10 
и 
1 
13 
14 

20 4349 4578 407 5210 15 2549 2783 3015 3267 
16 
17 
18 
19 
20 


26 6702 6934 7265 7539 21 4315 4536 4863 5146 


27 ПП — 7404 70% 795 22 442 486 519 548 
28 74 701 — 8224 — 8408 23 4080 50% 5550 58% 
20 8830 832 8808 9019 24 5320 - 5557 5801 6174 
25 5602 500 6253 6и 
3 1 1.671 3.349 8.438 17.08 20 — 0070 0298 607 6004 
2 35.00 50.24 81.78 119.8 27 0407 6008 707 7286 
3 15.0 149.3 211.3 278.2 28 687 7100 7405 7685 
4 283.3. 285.0 375.5 468.5 9 798 754 7858 8105 
5 382 447.2 564.3 650.5 30 . 7897 8043 899 855% 
8 — 54.8 030.1 774. 908.8 31 М9 82 8878 0079 
70 70.2. 829.7 99.4 1% 
8 928.9 104 1298 1402 32 1 1.50 3.140 7.009 16.02 
9 142 170 143 108 2 3281 47.06 76.61 1122 
10 1807 1508 1720 193 3 1084 199.7 197.7 200.4 
m 1004 1755 — 199 220 4 218.2 266.5 351.3 4385 
12 1850 2013 2268 2495 5 353.4 418.0 527.5 6365 
13 — 2107 2280. 2546 2787 6 508.7 584 720.8 849.6 
4 233 2555 2984 305 7 604 774.4 927.7 1074 
15 2648 2830 3129 3389 8 861 973.4 114 130 
16 292 3131 13433 3099 9 — 104 1184 1375 153 
17 — 322) 332 зиз 406 10 123 Мм 1612 184 
18 — 3580 3742 4000 4330 п 1492 > 1604 1857 2002 
19 3848 4060. 4388 4660 12 1120 183 210 2326 
20 — 4106 438 419 5005 13 1957 2119 2870 2507 
21 . 4499 — 4728 5061 5349 14 . 229 2374 2686 — 2873 
22 4844 504 — 541] 5 15 2456 26368 2909 3154 
98 5201 58% 65772 6061 16 2718 2005 3189 344 
24 5503 505 644 — 6430 17 2088 з 3474 3733 
25 50 6192 608 6810 18 3205 3405 3760 4031 
, 26 68668 6507 695 7204 19 3512 3756 4004 4335 
27 6797 704 1и ти 20 384 4055 4309 404 
38 — 720 — 7481 7793 8047 21 446 4 — 4081 4058 
V 29 ТЇЗ 705 828 854 22 457 4670 490 5279 
30 8081 — 8577 — 8843 09050 23 4778 . 4999 5995 5606 
24 509 — 5332 560 500 
91.1 167 3,242 814 16.58 25 6451 5675 6008 — 6281 
2 3388 48.0 79,11 115.8 26 5805 6030 6356 6631 
- 09 191 144 диз 200 27 6175 6399 6721 692 
e 225.5 275,4 363.0 453.0 28 6562 — 0784 7101 7364 
365,4 432.1 5452 657.8 29 602 7189. 7498 7752 
6 526.1 608.5 745.2 878.2 30 7412 7623 7919 8161 
1 70.0 80.1 959.4 ип 31 7808 809 8378 8002 
9106 


INCOMPLETE BETA FUNCTIONS 


TABLE 1—(cont.)


N x .005 .010 .025 .050
33 1 1.510 3.045 7.600 15.53 
2 3140 4561 7426 108.8 
3 161 1354 191.5 252.4 
4 21.3 258.2 340.8 448 
5 342.2 4047 510.9 616.6 
6 492.3 509.6 6979 8228 
7 683 749.4 898.0 100 
8 87.8 94.7 1109 18 
9 109 145 1380 1508 
10 1231 19% 159 176 
п 142 1580 1790 19% 
12 1002 180 2040 2250 
13 180 207 2291 2511 

ч 2127 2208 248 278 35 
15 2871 255 21 309 
16 2022 2804 3080 3326 
17 2881 3070 3355 3008 
18 — 314 3342 306 3894 
19 — 3419 302 392 485 
20 3700 3007 4214 4482 
21 3900 4200 45138 41И 
22 4287 4501 4818 5002 
23 453 4800 5129 5405 
2 4907 5126 — 5448 — 5724 
25 5231 — 5452 5774 6090 
26 5566 5787 609 6389 
27 5013 604 бм 6724 
28 6274 644 6810 7075 
20 6053 6870 7180 748 
30 7053 7265 797 1815 
31 7482 7088 7077 8218 
32 7955 8152 8424 802 
33 8517 8007 8042 912 
ЕЛ 1 1474 2.956 7.44 15.07 
2 30.85 44.25 72.05 1055 
3 101.9 1313 1858 244.8 
4 204.8 250.8» 330.0 412.0 
5 3316 3923 405$ 507.8 
6 411.0 552.0 6164 — 797.6 
7 6.7 726.0 802 1008 
8 813 91.1 105 1228 
9 99.1 110 1% 1456 
10 1190 1315 1510 — 1091 
n 1395 159 179 1932 
12 1607 1751 105 2179 
13 1828 1980 907 2431 
м 206 227 2405 2688 
15 2291 240 219 2951 

18 2588 209 2978 3718 36 
17 2782 2965 3243 3% 
18 3088 32207 3513 3766 
19 3300 3495 579 407 
20 350 370 400 4332 
21 3847 4051 4357 4% 
22 4131 4339 440 4918 
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35 TABLE 1—(cont.) 7 
N x .005 .010 .025 .050 N x .005 .010 .025 .050
8 763.0 858.1 1012 1157 27 5077 5284 5588 5859 
9 963.3 1043 1212 1371 28 5368 5576 5880 6140 
10 1119 1236 1420 1591 29 5607 5876 6179 6436 | 
11 1310 1436 1635 1818 30 5975 6183 6485 6739 | 
12 1509 1044 1856 2049 31 6294 6501 6799 1048 
13 1114 1859 2082 2285 32 6626 6830 7128 7367 
14 1927 2079 2314 25% 33 6972 7172 7548 7695 
15 2146 2300 2551 2772 34 7337 7532 7809 8036 
16 2371 2539 2793 3022 35 7727 7916 8181 8395 
17 2003 277 3040 3275 36 8158 8336 8584 8781 
18 2841 3021 3202 3533 87 8666 8830 9151 9222 
19 3085 3270 3549 3795 
20 3334 3524 3810 4061 38 1.319 2.644 6.660 13.49 


28 5566 577 6085 6347 9 8833 0983.8 1144 1295 
29 5880 0001 6808 6057 10 1055 1168 1340 1503 
30 6206 6416 6719 6973 п 1235 1354 1542 1716 
31 0544 — 0752 700 — 7299 12 1421 150 — 1750 — 19M 
32 808 702 794 7636 13 1005 1751 1903 2156 
8 тп ип 7753 705 М 184 1958 2181 2383 
34 7601 7808 8134 8353 5 — 209 21721 244 и 
35 811 — 824 8547 8749 16 2230 299 2631 388 
36 8631 8799 00% 9202 17 2447 262 282 3087 
18 2669 2840 3098 3329 
37 1 1.355 2.7100 6.840 13.85 19 2800 202 398 354 
2 28.932 40.00 66.15 96.89 20 3128 3309 3582 3822 
8 9.4 1204 1704: 224.6 21 3305 3551 280 14075 
4 187.7 22.4 302.5 377.8 22 3608 3708 4% 431 
5 303.6 3502 453.7 547.9 23 3867 4050 4339 4501 
6 4360.3 5051 619.3 730.6 24 410 407 4599 — 4854 
7 582.0 00.8 79.2 93.2 25 4370 469 4865 5121 
8 740.0 833.4 98.7 12 26 9635 4837 5135 5392 
9 99.1 102 177 133 27 4908 510 500 5667 
10 106 19 1379 156 28 5185 5300 560 5047 
и 1271 — 1894 1587 1765 29 5470 5606 59% 86231 
12 144 155 180 190 30 56 5070 62 6521 
13 1663 — 1803 2021 2219 31 6007 6211 6568 86817 
м 1860 — 2017 226 2453 32 63709. 6582 6875 7170 
15 2081 2% 205 2691 38 673 6004 71920 7401 
10 — 2200 мй 2710 2933 34 742 709 70 ТМ 
17 — 2522 201 2949 3178 35 7399 — 7591 7800 8084 
18 — 2752 2077 319 зв 36 Ты 7906 8225 8435 
19 2087 23168 244) 61 37 802 837 8619 881 
20 — 3227 виз 3600 3038 38 8000 8859 905 — 9202 
M 21 ми — 304 — 399 4190 
7? 96% 390 420 444 8 1 1285 2.577 6.40 13.14 
23 3988 4181 — 4476 — 4732 2 2685 3851 62.72 91.88 
2 447 4M8 — 4746 005 З 8854 1.1 161.5 212.9 
+ 4517 471 502 бз 4 1778 217.3 286.6 3580 
498 — 4999 502 5503 5 287.4 340.1 429.7 519.0 


INCOMPLETE BETA FUNCTIONS 
TABLE 1—(cont.) 
N x .005 .010 .025 .050 N x .005 .010 .025 .050
6 429 478.0 586.2 601.9 22 33% 3577 380 4088 
7 551.3 628.0 753.5 874.0 23 3627 3812 4090 4332 
8 700.5 788.1 0929.6 1064 24 3863 4052 4333 4578 
9 859.0 956.9 1113 1260 25 4104 4296 4581 4828 

10 1026 1133 1304 1462 26 4350 4544 4832 5081 
п 1200 1317 150 1609 27 4600 4797 5087 5337 
12 1381 1506 1702 1881 28 4857 5055 5847 5597 
18 1569 1702 1909 2097 29 5119 5319 5611 5861 
1 1762 1903 2121 2317 30 5388 5588 5880 6129 
15 1961 2109 2337 2541 31 5668 5863 6155 6402 
16 2165 2320 2557 2709 82 5946 6146 6435 6080 
17 2375 2536 2781 3000 33 6237 6436 6722 6903 
18 2590 2757 3010 3235 34 6537 6734 7017 7203 
19 2810 2982 3242 8473 85 6849 7043 7320 , 750 
20 3034 3211 3478 3714 36 7174 7364 7634 7856 
21 3264 3446 3718 3958 37 7516 7701 7961 8174 
22 3499 8685 3963 4206 38 7882 8060 8308 8509 
23 3738 3928 4211 4457 39 8285 8453 8684 8868 
24 3982 4175 4462 4712 40 8759 8913 9119 9278 
25 4232 4428 4718 4970 
26 4487 4686 4979 5232 4 1 1.223 2.451 6.178 12.50 
27 4748 4949 5243 5497 2 25,52 36.62 59.68 87.36 
28 5015 5217 5513 5766 3 84.13 108.4 153.5 202,4 
29 5289 5491 5787 6040 4 168.8 206.4 272.3 340,2 
80 5569 5772 6067 6318 5 272.8 8229 408.1 493.0 
81 5857 6060 6354 6602 6 391.8 453.7 5566 657.0 
32 6154 6355 6047 6892 7 522.0 505.8 715.2 829.8 
33 6460 6660 6947 7188 8 604.2 747.5 882.1 1010 
34 6778 6975 7257 7492 9 814.3 — 907.8 1056 11% 
85 7110 7303 7578 7805 10 972.1 1074 1236 1387 
36 7459 7647 7913 8130 11 1137 1248 1422 1583 
37 7833 8014 8268 8473 12 1308 1427 1613 1784 
38 8245 8416 8652 8841 13 1485 1611 1808 1988 
39 8730 8886 9098 9261 14 1667 1801 2008 2196 
15 1855 1995 2212 2408 

40 1 1.253 2.512 6.327 12.81 16 2047 2194 2420 2623 
2 26.17 37.54 61.14 89.57 17 2244 297 262 2841 
8 86.28. 11.2 , 157.4 207.5 18 2447 2004 2847 3062 
4 173.2 211.7 279.8 348.8 19 2652 2816 3006 3287 
5 279.0 3313 418.6 505.7 20 29% 202 2288 354 
6 4021 4655 571.0 674.0 21 300 90 34 — 3744 
7 536.1 611.5 733.8 851.3 22 3208 3006 3748 3077 
8 ~ 681.9 767.3 905.2 1036 23 3522 3708 3076 4214 
9 836.1 931.4 104 127 24 350 305 — 4212 — 4453 

10 998.3 из 1209 ми 25 3088 ап — 4451 4004 
11 1168 1281 1460 1625 26 4220 4411 4694 4939 
12 134 1400 1056 1831 27 4402 4655 41 5187 
13 1526 1655 1857 201 28 4709 4904 5192 5438 
ч 173 1850 2003 2255 20 4902 558 5447 5603 
15 1900 2051 203 243 30 519 517 5706 5952 
16 204 3956 2486 — 2004 31 ыз 51 5970 6215 
17 — 2307 2465. 2704 2018 32 ым 5051 — 0299, — GABE 
18 2516 200 2027 346 33 бй 6228 6513 6754 
19 2 207 3151 3377 34 6817 6512 6794 17082 
20 2 319 3380 261 35 600 6805 7083 1315 
21 3169 3346 3613 388 36 6917 7108 7880 7605 
TABLE 1—(cont.)


N x .005 .010 .025 .050
37 726 702 7687 7005 
38 7971 1152 8008 8216 
30 7029 — 8104 847 виз 
40 8323 508 8715 — 8894 
41 8788 8958 940 9205 

42 1 1.193 23% 60% 12.2 

3 2401 35.74 58.20 85.97 
3 82.00 1058 149.8 1975 
4 1047 201.3 26.6 331.9 
5 261 315.0 308.1 481.0 
6 382.0 424 549.8 040.9 
7 5098 581.0 6974 809.3 
8 647.4 728.7 860.1 9847 
9 79.8 8843 1080 1166 
0 947.2 1047 1205 1353 
п 1107 1210 196 — 154 
13 174 1390 1572 179 
13 148 1570 1702 1938 
м 1624 1754 1057 2141 
15 1806 1049 — 2155 27 
16 1001 206 2857 25% 
п 248 2334 2563 2708 
18 2381 295 27% 2083 
19 2581 2740 394 320 
20 2785 2950 3200 3422 
21 294 — 3104 200 3906 
22 3207 3381 3642 3872 
23 3424 3602 3808 4102 
24 — 3045 — 3820 407 4333 
25 9870 4054 4329 4568 
26 4099 4286 4504 4806 
27 4333 4522 4803 5046 
28 4571 4762 5045 5289 
29 4814 5007 5292 5536 
30 5062 5257 5542 5786 
31 5316 5511 5796 6039 
32 555 570 6055 6207 
33 5841 6036 6319 6559 
34 608 6807 6588 6825 
35 6394 — 6580 — 6804 7097 
36 6683 6873 7146 7374 
37 6982 7169 7437 7658 
38 7295 708 758 7053 
39 70% 7800 8052 82% 
10 юми — 85 ви — 8578 
41 8360 8521 8743 8919 
42 8815 802 9159 9312 
S 1 11% 297 5898 по 
2 2492 3.00 М8: 6327 
к АР 80.14. 103.3 146,3 192.8 
4 1007 19.5 250.3 324.0 
5 259.7 807.4 388,5 469.5 
8 27.8 4317 598 655 
7 4973 500.8 


44 


N x .005 .010 .025 .050

8 631.5 — 710.8 — 839.1 060.9 
9 773.9 862.5 1004 1138 
10 923.6 1021 1176 1320 
11 1080 1185 1352 1506 
12 1242 1355 1533 1696 
13 1409 1530 1718 1890 
ч 1582 1710 1908 2088 
15 1759 1894 2101 2288 
16 1941 2081 2297 2492 
17 2127 2273 2497 2698 
18 2318 2409 2701 2907 
19 2513 2669 2908 8120 
20 2711 2878 8118 3335 
21 2912 3080 3331 3558 
22 3120 3291 3547 3773 
23 3331 3505 3766 3996 
24 8545 3723 3988 4221 
25 3763 3944 4213 449 
26 3984 4108 4442 4679 
27 4210 4396 4673 4912 
28 4441 4629 4908 5148 
29 4076 4806 5146 5387 
30 4915 5106 5388 5629 
31 5159 5351 5633 5874 
32 5408 5601 5883 6123 
33 5663 5856 6137 6375 
34 5924 6116 6396 6632 
35 6192 6383 6660 0893. 
36 6467 6657 6930 7159 
87 6751 6938 7207 7431 
38 7045 7229 7492 7709 
39 7351 7531 7187 7996 
40 7673 7848 8094 8294 
41 8018 8185 8419 8607 
42 8396 8554 8771 8944 
43 8841 8984 9178 9327 
1 1.139 2.284 5.752 11.65 
2 23.77 34.09 55.53 81,36 
3 78.28 — 100.9 1420 188.4 
4 157.0 191.9 253.3 316.6 
5 253.6 300.2 379.4 458.6 
в 363.9 421.5 517.3 610.9 
7 485.5 — 553.3 6644 771.8 
8 616.4 693.8 — 819.2 038.2 
9 755.2 841.8 — 980.4 ии 
10 901.2 996.3 1147 1288 
n 1053 1157 1319 1470 
12 1211 1322 114% 1655 
13 1375 1492 1676 1845 
м 1543 1667 1861 2037 
15 1715 1847 2049 2233 
16 1892 2030 2241 2431 
17 2073 2216 2436 2692 
18 2259 2406 2034 2836 




INCOMPLETE BETA FUNCTIONS
TABLE 1—(cont.) 


45 


.025 .050 N x .005 .010 .025 .050
3039 3252 31 4878 5000 5995 5571 
327 3464 32 50 528 550 5804 
3457 3678 33 532 5530 58068 000 
3070 3895 34 5088 5771 6046 6980 
3886 — 44 35 0820 6018 — 600 6523 
4104 — 4335 36 6081 6209 6540 6770 
4325 — 4559 37 680 65 ви 7021 
4550 4786 38 6 6790 7055 7276 
4778 505 39 6879 7081 6 757 
5008 5247 40 7002 7341 705 1805 
542 5481 4i — 7457 тий 7878 8080 
5480 5718 42 7768 796 8173 33% 
5721 5959 43 68000 8260 8485 7 8606 
5066 6203 и 8400 800 8828 8080 
6216 6451 45 8880 90 9018 09856 
6470 6702 
6729 0958 4 1 100 215 5.502 11.14 
6904 7219 2 2272 32.0 5.10 77.80 
7265 7485 3 74.81 96.45 136.6 180.1 
7545 1758 4 150.0 183.4 242.0 302.5 
7833 8039 5 202 286.7 302,5 438.2 
8135 831 6 347.5 402.5 4911 588.6 
8453 8038 7 434 508,2 634.5 76.7 
878 90867 8 581 602.2 782.0 895.9 
9196 9342 9 72.4 803.2 995.7 1001 
10 850.4 950.3 1005 1280 
5.625 11.39 п 104 10 1259 — 140 
54.29 79.54 12 пй 0 107 150 
139.6 184.2 13 1810 1423 1599 1700 
247.5 309.3 14 1069 18 174 108 
370.8 482 15 16% 1750 1954 22 
505.4 506.9 16 1801 109 206 238 
649.1 753.6 17 1973 210 — 2920 — 2509 
800.2 916.6 18 — 219 — 2201 200 — 2708 
957.5 105 19 208 208 2700 2000 
1121 — 1208 Ж % 26 2804 30% 
1288 14% 21 7 2854 200 9209 
1460 1617 23 2887 3048 3280 — 3502 
1637 1801 23 3080 324 — 3491 3708 
1817 1989 24 зв 344 305 — 3016 
2000 2180 25 М5 300 302 4120 
2180 — 2373 20 3678 380 401 4388 
2375 2560 27 3884 4001 4323 4552 
2508 2768 28 40% 4070 4598 4708 
2705 2909 39 406 40 400 4987 
2964 3173 30 4% 405 4070 5208 
3166 3280 31 4 4907 508 5492 
3871 3588 33 407 502 5424 5658 
3578 3790 33 8105 582 504 5887 
878 4012 34 508 5605 5887 619 
401 4% 35 5666 582 63 634 
4910 446 38 юз 8604 Og 00 
4434 46% зт 6% 6341 60 68и 
4655 4888 38 600 054 0858 7080 
4878 5113 39 бп 683 703 1881 
5105 534 40 650 709 T7074 17587 
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TABLE 1—(cont.) 


N x .005 .010 .025 .050 N x .005 .010 .025 .050
41 7218 7393 7643 7850 3 71.04 92.36 130.8 1725 
42 1507 7679 7921 8120 4 143.6 175.5 231.7 289.7 
43 7812 7978 8210 8400 5 231.7 2744 347.0 419.5 
44 8137 8295 8516 8694 6 $32.4 385.1 472.8 5856 
45 8493 8642 8847 9010 7 443.2 505.3 607.1 705.0 
46 8912 9047 9229 9370 8 562.3 033.3 748.1 857.3 
9 688.7 707.9 895.0 1015 
1 1.000 2.138 5.385 10.01 10 821.3 908.4 1047 1176 
2 22.23 31.0 51.90 76.13 и 959.5 1054 1203 1342 
3 73.19 — 94.36 133.6 176.2 12 1103 1204 1364 1510 
4 146.7 179.4 236.8 205.9 13 1251 1359 1528 1683 
5 236.8 280.4 — 354.0 1428.6 м 1403 1517 1695 1857 
6 339.8 — 393.0 483.2 570.8 15 1559 1079 1866 2035 
7 453.1 516.5 6205 720.5 16 1719 1845 2040 2215 
8 574.9 — 647.4 — 704.7 876.2 17 1883 2014 2216 2398 
9 704.2 785.2 914.9 1037 18 2050 2186 2395 2583 
10 8399 028.9 — 1070 1202 19 22 — 23001 2577 270 
и 981.3 1078 1230 1372 20 2394 2539 2761 2959 
12 1128 1232 1395 1544 21 2571 2721 2948 3150 
13 1280 130 152 1720 22 2751 — 2904 — 3198 304 
14 1435 1552 1734 1899 23 2934 3091 3329 3540 
15 1595 1718 1909 2081 24 3119 3280 3523 3737 
16 1759 1888 2087 2265 25 3307 3472 3719 3936 
17 1926 2061 2268 2452 26 3199 3007 3918 4137 
18 2097 2237 2451 2042 27 3694 3864 4119 4340 
19 2272 2417 2637 2833 28 3891 4005 4322 4546 
20 2451 2600 2826 3027 29 4092 4207 4527 4753 
21 2633 2786 3018 3224 30 4296 4473 4735 4962 
22 2818 2974 3212 342 31 4503 4082 4946 5174 
23 3005 2166 3408 3622 32 4714 484 5150 5387 
24 3195 3360 3607 3824 33 4928 5109 5375 5603 
25 3389 3557 3808 4029 34 5145 5327 5593 5821 
26 3587 3757 4012 4235 35 5366 5549 5815 6043 
27 8787 3960 4218 444 36 5592 5774 6040 6267 
28 3990 4166 4427 4654 37 5822 6004 6269 6493 
29 4196 4375 4639 4867 38 6057 6238 6501 6723 
30 4406 45856 4853 5082 39 6200 607 6737 606 
81 4620 4801 5069 5299 40 2 6721 6978 7198 
32 4887 5020 5288 559 41 674 — 6971 7224 7435 
33 5057 5242 5510 5741 42 705. 8 7476 7681 
34 5283 5467 5736 я irs 

БО: ЫШ, ы tie 5966 43 702 таз 774 1794 
86 5745 pes qi 6194 44 7602 7768 8002 8194 
37 ам e 6425 45 7896 8056 8280 8364 
зі po ы 6659 46 8200 8302 8575 8746 
FI case а сш 47 852 8095 8893 ed 

2 6734 6013 zi 257 48 8955 9085 9260 93 
6998. 7174 7426 7635 4% 1 1.028 2.051 5.166 10.46 
42 7271 7444 7690 7893 2 21.32 30.58 40,52 13.01 
43 756 774 700 818 3 70.15 90.45 128.1 168.9 
“ 7855 — 8018 — 8245 — 8452 0 1 i 6 

45 8173 8320 4 140.6 17.9 226.9 283. 
a asa во 8546 — 8721 5 226.9 2686 339.7 410.8 
d dui 8871 9030 6 325.4 377.0 4620 546.9 
9007 0245 9383 7 433.7 494.5 504.2 — 690.2 
Ё ў 2 9.2 

1 1.044 — 2.094 5.273 10.68 pus Mr a s ў 
21.77 31.28 50.87 74.54 10 803.5 888.8 1025 1151 

ра аза ee 


INCOMPLETE BETA FUNCTIONS 
TABLE 1—(cont.) 


N x .005 .010 .025 .050 N x .005 .010 .025 .050
Шш 086 1081 1177 вв 6 318.6 360.2 458.4 535.7 
12 1079 пз 134 из 7 ‚447 4843 581.0 — 676.0 
13 1223 1329 1495 1646 8 538.8 6068 717.0 821.8 
14 — 1972 Мм 1658 1817 9 659.6 785.7 857.6 0725 
15 — 1524 1048 1825 1991 10 786.5 870.0 10% 117 
16 160 1804 195 2168 11 986 109 1158 1286 
17 1840 1969 2167 2346 20 1055 пз 1900 147 
18 2003 2157 2342 2526 19 п? 1801 1469 1612 
19 2109 208 2520 2709 M 1340 142 1623 1779 
20 2939 2482 2700 — 2804 15 1401 1007 1786 109 
21 — 2511 2659 2880 3081 16 105 16 1952 2121 
2 266 298 3007 3270 17 100 106 200 2205 
23 2805 3020 3254 з 18 1959 2000 2291 2472 
24 — 3040 — 324 ми з 19 2 2257 2406 200 
25 382209 3391 8635 — 3848 20 2080 2427 2641 281 
26 3416 3581 3з — 404 21 2456 250 2819 3014 
27 306 373 404 422 22 2006 2775 2000 39 
28 3798 3008 4222 442 23 279 2053 382 3385 
29 303 46 4402 4н 24 — 297 392 67 3578 
30 40 4306 4 488 25 31560 3314 354 2703 
31 49% 499 4809 504 28 3338 М9 3740 3055 
32 497 4П5 5087 5262 20 3502 387 303 410 
33 4805 4% 507 540 28 3700 3877 406 434 
34 505 — 5195 — 5450 5685 29 83800 400 481 451 
35 5229 500 5074 58% 30 4090 423 4518 4799 
30 58 5% 5802 617 31 @ 440] — 4718 — 4999 
37 500 5850 613 6337 32 б 4060 — 490 — 512 
38 5800 6076 0338 6550 8 4088 4804 — 5124 5347 
39 6127 6300 656 6785 и 4809 500 5331 54 
40 6363 61 6708 704 35 500 508 540 57% 
41 605 — 6781 704 — 7247 36 500 5489 5752 — 5974 
42 6853 707 706 Ті 37 5% 5705 5067 6188 
43 718 709 75238 77% 38 5745 5з 68 — 0404 
44 той 1700 777 ТИ 39 507 045 — 0404 6022 
45 — 7647 180 8040 8229 40 6105 6373 600 (084 
40 795 802 8313 803 а М8 6008 6% 7060 
47 8240 892 800 8770 42 605 0839 700 7208 
48 8580 87217 8915 9068 43 6000 701 706 751 
40 8975 о 9275 9407 44 740 7329 — 7509 — 7709 

45 742) 7585 7810 8012 

50 1 1.002 2.00 5.062 10.25 46 7000 785 807 8262 

2 20.89 20.07 48.82 71.54 47 7978 8108 83456 — 8522 
3 68.73 88.61 1255 165,5 48 805 воз 800 8794 
4 137.7 168.4 222.8 27.9 49 806 8745 8055 9086 
5 222 263.1 332.7 4024 50 8005. 9120 9280 — 9418 


BIBLIOGRAPHY OF NONPARAMETRIC STATISTICS 
AND RELATED TOPICS 


I. RICHARD SAVAGE 
National Bureau of Standards 

This bibliography contains 999 references on nonparametric 
statistics and related topics, classified as follows: (A) Surveys 
and Discussions (39), (B) Theory (31), (C) Tchebycheff In- 
equalities (94), (D) Tolerance Sets (21), (E) Goodness of 
Fit (122), (F) Multisample Problems (53), (G) Parameter 
Problems (135), (H) Contingency Tables (75), (I) Randomness
(109), (J) Correlation and Curve Fitting (96), (K) Compara- 
tive Studies (49), (L) Systematic Statistics (127), (M) Scaling 
(37), (N) Distribution Theory (383), (O) Applications (89), 
(P) Tables (238), (X) Miscellaneous (28). 


INTRODUCTION 


NONPARAMETRIC statistics has recently become an important special
field of statistics. Papers related to nonparametric problems were
published in the nineteenth century, but the true beginning of the sub- 
ject may be taken as 1936, the year in which Hotelling and Pabst pub- 
lished their paper on rank correlation. By 1943 the literature had be- 
come extensive enough to warrant a review article by Scheffé. Now the 
number of papers concerned with nonparametric statistics appearing
in statistical journals is very large; in fact these articles are taking a
considerable portion of the available space. Consequently supplements 
to this bibliography will be issued in order to keep it up to date. 

In spite of the abundance of nonparametric literature, there is no 
generally accepted definition of the field. It is not always clear which 
techniques, problems, and theories are nonparametric. In the prepara-
tion of this bibliography over-inclusion has been deemed better than
omission of titles that might be of use to those interested in border-line
aspects of nonparametric statistics.

Entries in the bibliography are arranged alphabetically by author, 
and chronologically within authors. After each entry one or several let-
ters appear, indicating the categories in the following list to which the
entry belongs:


A. Surveys and Discussions
B. Theory
C. Tchebycheff Inequalities
D. Tolerance Sets
E. Goodness of Fit
F. Multisample Problems
G. Parameter Problems
H. Contingency Tables
I. Randomness
J. Correlation and Curve Fitting
K. Comparative Studies
L. Systematic Statistics
M. Scaling
N. Distribution Theory
O. Applications
P. Tables
X. Miscellaneous


Following many of the entries appears a sequence of digits of the form 
xy-abc; this means that the entry was reviewed in Mathematical Re-
views in the year 19xy on page abc.
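
The review code just described can be decoded mechanically. The following is a minimal sketch in Python, not part of the bibliography itself; the function name `parse_mr_code` is hypothetical, and the allowance of one to four page digits is an assumption, since the text gives only the pattern xy-abc.

```python
import re


def parse_mr_code(code: str) -> tuple[int, int]:
    """Split a code of the form xy-abc into (review year, page number).

    Assumption: xy is always a two-digit year in the 1900s, as the text states
    ("the year 19xy"); the page part is allowed to have 1 to 4 digits.
    """
    match = re.fullmatch(r"(\d{2})-(\d{1,4})", code.strip())
    if match is None:
        raise ValueError(f"not an xy-abc code: {code!r}")
    year = 1900 + int(match.group(1))
    page = int(match.group(2))
    return year, page
```

For instance, a code written as 48-123 would be read as a review in 1948 on page 123.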


A. Surveys and Discussions 


Since nonparametric statistics has existed as a special field only about 
fifteen years, surveys and discussions of the general subject are scarce. 
Two important papers of this type are by Scheffé [1943b] and Wolfo- 
witz [1949]. These papers give a comprehensive view of the problems 
and results obtained up to their time of publication. Pitman [1948] and 
Hemelrijk et al. [1951] have sets of lecture notes devoted to nonpara- 
metric statistics. Wilks [1948] covered many of the problems of non- 
parametric statistics in his discussion of order statistics. Wallis [1952] 
gave a brief introduction to the subject and its applications. Most of 
the other papers given this classification are specialized and their cross
classifications will better indicate their content. 


B. Theory 

There does not exist a unified theory of nonparametric statistics, 
but there have appeared theoretical approaches to some of the special-
ized problems. The structure of critical regions with optimum proper-
ties was discussed by Feller [1938], Scheffé [1943a], Lehmann and Stein
[1949], Hoeffding [1951b], and Lehmann [1951]. The use of “maximum
likelihood” in the nonparametric theory was introduced by Wolfowitz
[1942]; Levene [1952] made further use of this concept. Hodges and 
Lehmann [1950] gave some nonparametric estimators which are opti- 
mum in terms of minimax theory. 
C. Tchebycheff Inequalities


Tchebycheff inequalities are included in the bibliography since they 
allow one to make probability statements when only a small amount 
of a priori information (usually several moments) is given about the 
distributions involved. Fréchet [1950] presented a review which included 
many of the inequalities of the Tchebycheff type. Godwin [1944] gave 
an English summary of the Fréchet material. Guttman [1948b] and
Midzuno [1950] introduced inequalities using higher sample moments, 
which gave shorter average confidence intervals than the usual in- 
equalities. 
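
As an illustration of the kind of statement meant here, the classical Tchebycheff inequality bounds P(|X − μ| ≥ kσ) by 1/k² for any distribution with finite variance, using only the first two moments. The sketch below is in Python; the function name `chebyshev_bound` is hypothetical and is not drawn from any of the papers cited.

```python
def chebyshev_bound(k: float) -> float:
    """Upper bound on P(|X - mean| >= k * sd), valid for any finite-variance law.

    The bound 1/k^2 exceeds 1 when k < 1, so it is capped at 1 to remain a
    meaningful probability statement.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, 1.0 / (k * k))
```

With k = 2 the bound is 0.25, whatever the form of the distribution; under normality the corresponding probability is far smaller, which is the price paid for assuming so little.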


D. Tolerance Sets 


Wilks [1941] gave the first presentation of nonparametric tolerance 
limits. There have been subsequent generalizations of the theory of 
tolerance limits treating multivariate samples, irregularly shaped re- 
gions, discontinuous cases, and sequential cases. A recent paper in this 
field is by Fraser [1953a]. 


E. Goodness of Fit


A goodness of fit test has the following properties: (1) It is defined 
for samples from some large class of distributions, such as all continuous 
univariate distributions. (2) The null hypothesis is either some specified 
distribution or a class of distributions of which the functional form is 
known. (3) For all null hypotheses the test statistic used has the same 
distribution (at least in the limit). (4) The test is consistent. 

The first goodness of fit test, chi-square, was introduced by K. Pear- 
son [1900]. Since then many new procedures have been presented. Cur- 
rently, there is much interest in the Kolmogorov-Smirnov tests and 
related topics. These have been summarized by Anderson and Darling 
[1952]. Some progress has been made in devising goodness of fit tests 
for multivariate problems; see papers by P. B. Simpson [1951] and 
Rosenblatt [1952a]. There has been little justification for the proposed 
goodness of fit procedures, but Neyman [1937], Mann and Wald [1942], 
and Wolfowitz [1942] gave procedures having optimum properties 
other than consistency. 
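
To make the Kolmogorov-Smirnov idea concrete, the one-sample statistic is the largest vertical distance between the empirical distribution function of the sample and the hypothesized distribution function. The following Python sketch assumes a continuous, fully specified null distribution supplied as a function `cdf`; the name `kolmogorov_statistic` is hypothetical.

```python
def kolmogorov_statistic(sample, cdf):
    """Return D_n = sup |F_n(x) - F(x)| for a fully specified continuous F.

    The supremum is attained at a sample point, just before or just after
    the jump of the empirical distribution function, so both are checked.
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d
```

Because the statistic depends on the sample only through the values F(x), its null distribution is the same for every continuous F, which is property (3) above.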

F. Multisample Problems 


Multisample problems or multisample goodness of fit problems in- 
volve testing the hypothesis that several samples come from the same 
population. Solutions to these problems should satisfy the following 
conditions: (1) Under the null hypothesis the test statistic is distribu-
tion-free (at least in the limit). (2) The procedure is consistent for a
large class of alternatives.

An early approach to this problem was K. Pearson’s [1911] two sam- 
ple chi-square test. Since then, many procedures have been introduced. 
Recently, there have been many investigations of tests related to the 
Kolmogorov-Smirnov test; an example is the work of Anderson and 
Darling [1952]. Tests having optimum properties other than consist- 
ency were suggested by Wolfowitz [1942] and Lehmann [1951]. 


G. Parameter Problems 


Parameter problems include estimation and testing procedures deal- 
ing with location and scale parameters. Although these problems in- 
volve parameters, they are nonparametric since (1) the parameters are 
defined for large classes of distributions, and (2) the proposed proce- 
dures lead to probability statements that are distribution-free. 

A typical parameter problem is the testing of the hypothesis that a
sample comes from a distribution with some specified percentage point. 
This problem has received extensive treatment by K. R. Nair [1940b], 
Steward [1941], Dixon and Mood [1946], Noether [1948, 1951] and 
Walsh [1949c, 1951a]. Testing the hypothesis that two samples come 
from populations with the same median has been examined by many 
authors beginning with the work of Wilcoxon [1945] and summarized 
by Kruskal and Wallis [1952]. Nonparametric analysis of variance has 
been discussed by Pitman [1937c], Friedman [1937], G. W. Brown and
Mood [1951], and Terry and Bradley [1952c].
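
The percentage-point problem mentioned above is commonly handled by the sign test: if q0 is the 100p per cent point of a continuous distribution, the number of observations below q0 is binomial with parameters n and p. The sketch below, in Python, is one conventional form and is not taken from any of the papers cited; the function name `sign_test_pvalue`, the dropping of observations equal to q0, and the two-sided rule (summing all outcomes no more probable than the one observed) are assumptions.

```python
from math import comb


def sign_test_pvalue(sample, q0, p=0.5):
    """Two-sided exact sign test of H0: P(X < q0) = p.

    Assumptions: the distribution is continuous, and observations equal to
    q0 are discarded (a common convention, not the only one).
    """
    values = [x for x in sample if x != q0]
    n = len(values)
    below = sum(1 for x in values if x < q0)
    # Binomial(n, p) probabilities for 0, 1, ..., n observations below q0.
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[below]
    # Sum every outcome that is no more probable than the observed one.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))
```

The resulting probability statement is distribution-free in the sense used above: it holds for every continuous distribution having q0 as its 100p per cent point.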

Optimum procedures have been investigated by Wolfowitz [1942], 
Lehmann and Stein [1949], Hodges and Lehmann [1950], and Hoeffding 
[1951b].


H. Contingency Tables 

Contingency tables are the conventional r × s tables used to cross-
classify data. Techniques using contingency tables include the analysis
of association and tests of goodness of fit. Many of these techniques are 
distribution-free, since they are based on conditional distributions of 
the sample. 

An interesting theoretical treatment of contingency tables was
made by Fisher [1948]. Papers by E. S. Pearson [1947] and Barnard
[1947a, 1947b] give a survey of the field. Mainland [1948] gave
extensive applications and tables.
I. Randomness


In many situations it is desirable to examine the assumption of 
randomness. This involves testing the hypothesis that a sequence of
observations was made on independently and identically distributed 
random variables. Wald and Wolfowitz [1943] and Levene [1952] pre- 
sented many procedures of this type. 
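
One familiar procedure of this type is based on the number of runs in a sequence of two kinds of elements. The Python sketch below counts runs and gives the expected number of runs under randomness, 1 + 2·n1·n2/(n1 + n2), for n1 elements of one kind and n2 of the other; the function names are hypothetical, and the sketch is offered only as an illustration, not as the method of any paper cited.

```python
def count_runs(sequence):
    """Count maximal blocks of identical consecutive elements."""
    runs = 0
    previous = None
    first = True
    for item in sequence:
        if first or item != previous:
            runs += 1
        previous = item
        first = False
    return runs


def expected_runs(n1, n2):
    """Expected number of runs when all arrangements are equally likely."""
    if n1 + n2 == 0:
        raise ValueError("sequence must not be empty")
    return 1 + 2 * n1 * n2 / (n1 + n2)
```

An observed count far below the expected value suggests clustering or trend; a count far above it suggests too-regular alternation.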


J. Correlation and Curve Fitting 


Problems of correlation and curve fitting are properly classified as 
parameter problems (G) but the wealth of literature on this subject 
justifies a separate class. A fundamental paper on nonparametric 
correlation is the treatment of Spearman’s coefficient by Hotelling 
and Pabst [1936]. A small treatise on many rank correlation methods 
was prepared by Kendall [1948a]. Typical papers on curve fitting are 
by K. R. Nair and Shrivastava [1942] and K. R. Nair and Banerjee
[1942].
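
For readers unfamiliar with Spearman's coefficient, it may be computed from the differences d between paired ranks as 1 − 6Σd²/[n(n² − 1)]. The Python sketch below assumes there are no tied observations; the function names `ranks` and `spearman_rho` are hypothetical.

```python
def ranks(values):
    """Assign ranks 1..n in increasing order; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation for untied paired observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length, at least 2")
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

Since the coefficient depends on the observations only through their ranks, its null distribution under independence is the same for all continuous populations.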


K. Comparative Studies 


The work of F. N. David and Johnson [1951a, 1951b] on the distri- 
bution of the F statistic under non-normal conditions is typical of 
the material given this classification. Emphasis is placed on finding the 
operating characteristics of “normal” statistics under non-normal con- 
ditions rather than on the development of distribution-free statistics. 
These papers are nonparametric, since they show which of the para-
metric procedures have operating characteristics which are not strongly
dependent on the specific parametric assumptions that were used in
their development.


L. Systematic Statistics 


Mosteller [1946] introduced the term “systematic statistics” when 
referring to linear functions of the order statistics of a sample. Much 
of the theory of these statistics has involved the assumption of nor- 
mality. Nevertheless, these techniques have two things in common with 
nonparametric techniques: ease of computation, and "inefficiency." 
Dixon and Massey [1951] summarized many of the uses of systematic 
statistics. 

M. Scaling


Many statistical problems involve the measurement or the com- 
parison of objects where the units of measurement are not well defined, 
for instance in measuring tastes. Hence either artificial scales are de-
veloped or scales are avoided by using ranks. The reports by Bradley
and Duncan [1950], and Bradley and Terry [1951a, 1951b] contain 
much information on this subject. 


N. Distribution Theory 

The development of nonparametric procedures involves many dis- 
tribution problems. Mood [1940] found many of the distributions that 
are connected with run theory. Wald and Wolfowitz [1944] developed 
limit theorems needed in the theory of statistics based on the method 
of randomization. Hoeffding and Robbins [1948] gave special central 
limit theorems useful in developing tests of randomness. 


O. Applications

Since nonparametric statistics has developed recently there have 
been few published applications. However, most theoretical papers 
give illustrations of the methods being presented. 


P. Tables

Once a nonparametric technique has been developed it is often useful 
to have special tables to facilitate applications. An example is the 
tables by Swed and Eisenhart [1943], giving the distribution of runs. 


X. Miscellaneous 


In spite of the many classifications given, a few papers remain un-
classified. Papers by Wallis [1942] on the combination of independent
tests and Tukey [1946] on inequalities for deviations from the median
are typical of these miscellaneous papers.
ERRATA


Readers and authors are invited to submit corrections to papers pub- 
lished in any previous issue. These will be published each year, in the 
December issue. 


Hyrenius, Hannes, ON THE USE OF RANGES, CROSS-RANGES AND EX-
TREMES IN COMPARING SMALL SAMPLES, Vol. 48, No. 263 (September
1953), 534-45.

On page 536, in equation (9b), a factor T-Y is missing. 

Furthermore, the sum in the equation can be evaluated, simplifying 
the formula to 


(Ni — (М — №! . 
T(N: + № — 1)! 


Kruskal, William H., and Wallis, W. Allen, USE OF RANKS IN ONE-
CRITERION VARIANCE ANALYSIS, Vol. 47, No. 260 (December 1952),
583-621.

1. In Section 5.3 of [a] we should have mentioned, had we known of
it, a 1952 article by van der Reyden [b]. Van der Reyden develops Wil-
coxon's two-sample test independently, and tabulates critical values of
R at two-tail significance levels of 5, 2, and 1 per cent for all sample
sizes such that 10 ≤ N ≤ 30 and 2 or 3 ≤ n ≤ 12, the lower limit for n be-
ing 2 at the 5 per cent level and 3 at the other levels.

Since van der Reyden's tables for the 5 and 1 per cent levels cover
much the same ground as White's [a, Sec. 5.3.5], we have compared the
two tables wherever possible and have corresponded with van der Rey-
den, who in turn has corresponded with White, about discrepancies be-
tween them. The upshot of this correspondence is: (i) There are nu-
merous and fairly sizeable errors in the columns for n = 11 and 12 of
the three van der Reyden tables; van der Reyden has very kindly
sent us the corrected values, but these have not yet been published.²
(ii) In addition there are twelve scattered discrepancies, each of one
unit in R, between the van der Reyden and the White tables at the 5
and 1 per cent levels; in all of these van der Reyden appears to be cor-
rect, White's values leading to probabilities of Type I error slightly
greater than intended.

¹ For comments embodying or leading to these corrections and additions we are indebted to K. A.
Brownlee (University of Chicago), P. J. Rijkoort (Royal Netherlands Meteorological Institute),
L. J. Savage (University of Chicago), T. J. Terpstra (Mathematical Center, Amsterdam), D. van der
Reyden (Tobacco Research Board of Southern Rhodesia), and C. White (University of Birmingham).

² White, in a letter to us, points out that approximately correct values for columns 11 and 12 of
van der Reyden’s tables may be obtained as follows: move the entries in the L columns down one row
to find the approximately correct L values; then to obtain the corresponding U values use the relation-
ship U = n(m + 1) − L (van der Reyden's notation). This applies to all three levels of significance.

2. Reference [44] should have been listed as shown at the end of this 
note. Had this been available to us in time we should have included the 
following description in Section 5: 
Rijkoort’s C-Sample Test. Rijkoort [44] has proposed the C-sample 
test which rejects when 

$$S = \sum_{i=1}^{C} n_i^2 \left[ \bar{R}_i - \tfrac{1}{2}(N + 1) \right]^2$$

is large. The use of S is not equivalent to the use of H unless all n_i are 
equal; in that case the relationship is S = N²(N + 1)H/(12C). In Rijkoort’s 
paper k is used for our C, and when all the n_i’s are equal m is 
used to denote their common value. 
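
The relationship between S and H can be checked numerically. The following is a minimal sketch, not taken from either paper: it assumes untied ranks 1, ..., N, takes S as the sum of squared deviations of the rank sums from n_i(N + 1)/2 (the same quantity as the formula above when R̄_i is the mean rank of sample i), and uses the usual form of H from [a]. The function name and the rank assignments are hypothetical.

```python
from fractions import Fraction

def h_and_s(rank_groups):
    """Return (H, S) for a list of samples, each a list of untied ranks 1..N."""
    N = sum(len(g) for g in rank_groups)
    mid = Fraction(N + 1, 2)  # expected rank under the null hypothesis
    # S = sum n_i^2 (Rbar_i - (N+1)/2)^2 = sum (R_i - n_i (N+1)/2)^2
    S = sum((sum(g) - len(g) * mid) ** 2 for g in rank_groups)
    # H = 12 / (N (N+1)) * sum n_i (Rbar_i - (N+1)/2)^2
    H = Fraction(12, N * (N + 1)) * sum(
        len(g) * (Fraction(sum(g), len(g)) - mid) ** 2 for g in rank_groups
    )
    return H, S

# Equal sample sizes: C = 3, n = 3, N = 9 (hypothetical ranks).
equal = [[1, 2, 3], [4, 5, 9], [6, 7, 8]]
H, S = h_and_s(equal)
print(H, S, Fraction(9 * 9 * 10, 12 * 3) * H)  # 28/5 126 126

# Unequal sample sizes: the same conversion factor no longer reproduces S.
unequal = [[1], [2, 3], [4, 5, 6]]
H_u, S_u = h_and_s(unequal)
print(S_u, Fraction(6 * 6 * 7, 12 * 3) * H_u)  # 61/2 30
```

Exact fractions are used so that the identity for equal sample sizes holds exactly rather than to rounding error.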

Rijkoort tabulates the distribution of S for the following cases with 
all n_i equal: C = 3, N = 6, 9, and 12; C = 4, N = 8; C = 5, N = 10. He 
also gives the upper tails of the distributions of S (down at least to the 
upper 5 per cent points) for C = 3, N = 15 and for C = 4, N = 12. In addition, 
he gives approximate upper 5 per cent critical values of S for C 
from 3 through 10, and for (equal) n_i's from 2 through 10. True critical 
values are given in some cases. 

We have compared Rijkoort’s distributions with ours when C = 3 
and have found a few discrepancies. Correspondence with Rijkoort 
about these leads to the following corrections to the cumulative 
probabilities in Rijkoort’s tables. (We omit corrections of only a single unit 
in the last decimal place.) 


k = 3, m = 2: S = 18, P = 0.467. P should be 0.400. 

k = 3, m = 2: S = 14, P = 0.600. P should be 0.533. 

k = 3, m = 5: 554 ≤ S ≤ 654. The tabulated P's are all about 0.002 
too low. 
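
The two corrected values for k = 3, m = 2 can be reproduced by complete enumeration. The sketch below is an illustration only and rests on two assumptions: that the tabulated P is the upper-tail probability P(S ≥ s) under the null hypothesis, and that S is the sum of squared deviations of the rank sums from m(N + 1)/2. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def upper_tail(k, m, s0):
    """Exact P(S >= s0) for k samples of m untied ranks each, all orderings equally likely."""
    N = k * m
    centre = Fraction(m * (N + 1), 2)  # expected rank sum of each sample
    hits = total = 0
    for perm in permutations(range(1, N + 1)):
        # Consecutive blocks of length m play the role of the k samples.
        S = sum((sum(perm[i * m:(i + 1) * m]) - centre) ** 2 for i in range(k))
        total += 1
        hits += S >= s0
    return Fraction(hits, total)

print(float(upper_tail(3, 2, 18)))  # 0.4
print(upper_tail(3, 2, 14))         # 8/15, i.e. 0.533 to three decimals
```

Enumerating all N! orderings counts every partition into samples the same number of times, so the ratio is exact; the approach is practical only for very small N.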


In addition, Rijkoort has kindly sent us the following corrections to his 
table: 


k = 4, m = 2: S = 74, P = 0.040. P should be 0.038. 
k = 5, m = 2: S = 128, P = 0.0847. P should be 0.0910. 
k = 5, m = 2: S = 122, P = 0.1208. P should be 0.1280. 


Finally, in Rijkoort’s table of 5 per cent critical points the number 
pair 558-566 in column 3 and row 5 should be 566-578. 

3. On p. 587 of [a], in the fourth line after formula (1.5), the word 
"essentially" may be misleading. What we meant to indicate by this 
heuristic statement is that without the factor (N − 1)/N, H would be 
like a sum of squared standardized deviations in which the finite-population 
corrections to the variances and the correlation between means 
had been disregarded. The factor (N − 1)/N is the net result of giving 
due regard to these two points. 

4. In footnote 6, p. 591, it should have been stated that the comparison 
between use and nonuse of the continuity correction in the two-sample 
test pertains to the one-tail version. For the two-tail version, 
use of the continuity correction is advantageous only when the 
probability is 0.04 or more. 

5. To avoid an ellipsis, the following phrase should be added on 
p. 593, in line 14, just before the semicolon: "thereby altering the value 
of H." 

6. The errors listed in the "Corrections to Table 6.1" below have been found in Table 6.1, 
most of them as a result of correspondence with T. J. Terpstra.³ These 
corrections affect Figure 6.3 of [a] at a few points, but do not change 
the general patterns of deviations between true and approximate 
probabilities shown by Figure 6.3. 

7. We take this opportunity to call attention to a paper by Rijkoort 
and Wise [c] which has appeared since [a]. This presents new 
approximations to the sampling distributions for Friedman’s test [a, Sec. 5.2] 
and for the H test [a] if all samples are of the same size (in which case 
the H test and Rijkoort’s test [44] are equivalent). The approximations 
are based upon a series expansion for the inverse of the incomplete 
Beta integral. Nomograms facilitating use of the new approximations 
are given (in both cases) for significance levels from 1 to 10 per cent, for 
3 to 20 samples, for sample sizes of 1 to 30. 

8. We should also like to call attention to a recent paper by van der 
Waerden [d]. In this paper the power of the Wilcoxon test in the normal 
case is discussed, and an alternative nonparametric test is proposed 
that is more powerful than the Wilcoxon test in the normal case. 


REFERENCES 


[a] Kruskal, William H., and Wallis, W. Allen, “Use of ranks in one-criterion 
variance analysis,” Journal of the American Statistical Association, 47 (1952), 
583-621. 

[b] van der Reyden, D., “A simple statistical test,” Rhodesia Agricultural 
Journal, 49 (1952), 96-104. 

[c] Rijkoort, P. J., and Wise, M. E., “Simple approximations and nomograms 
for two ranking tests,” Proceedings, Koninklijke Nederlandse Akademie van 
Wetenschappen, Series A, 56 (1953), 294-302. 

[d] van der Waerden, B. L., “Order tests for the two-sample problem and their 
power,” Indagationes Mathematicae, 14 (1952), 453-58. 

[44] Rijkoort, P. J., “A generalization of Wilcoxon's test,” Proceedings, 
Koninklijke Nederlandse Akademie van Wetenschappen, Series A, 55 (1952), 
394-404. 

³ We are indebted to Jack Nadler for making the computations involved in rechecking Table 6.1. 
Almost all of the errors had occurred at one stage of the computations, and Mr. Nadler recomputed 
this stage completely. 


CORRECTIONS TO TABLE 6.1 


In each pair of lines, the first repeats the line from Table 6.1 with the 
erroneous entries italicized, and the second gives the corrections. 


                                        Approximate minus true probability

Sample sizes              True
 n1  n2  n3      H     probability       χ²      Γ (linear    B (normal
                                                  interp.)     interp.)

  2   2   1   3.6000      .267         -.101       -.167        -.267
                          .200         -.035       -.100        -.200
  4   2   2   4.8750      .100         +.012       -.020        -.002
              4.4583                   +.008       -.024        -.010
  4   3   2   6.4444      .009         +.031       +.012        -.002
                          .008         +.032       +.013
  4   3   2   6.4222      .010         +.030       +.011        -.004
              6.3000      .011         +.032       +.012        -.002
  4   3   2   5.4444      .047         +.019       -.005        -.010
                          .046         +.020       -.004        -.009
  4   3   2   5.4000      .052         +.016       -.008        -.013
                          .051                                  -.012
  4   3   2   4.4667      .101         +.006       -.0?         -.008
              4.4444      .102         +.007       -.019        -.002
  4   3   3   4.7091      .094         +.001       -.021        -.006
                          .092         +.003       -.019        -.004
  5   3   3   6.9818      .010         +.020       +.008        -.002
              7.0788      .009
  5   3   3   6.8606      .011         +.022       +.008        -.001
              6.9818                   +.019       +.006        -.003
  5   3   3   5.4424      .048         +.018       -.000        +.002
              5.6485      .049         +.010       -.007        -.007
  5   3   3   5.3455      .050         +.019       +.000        +.004
              5.5152      .051         +.013       -.005        -.004
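
The χ² column of the corrections to Table 6.1 lends itself to a quick consistency check. The sketch below is an illustration, not the authors' computation: it assumes that the χ² approximation for three samples uses 2 degrees of freedom, so that the approximate probability is exp(−H/2), and it uses the corrected H values, true probabilities, and χ² differences as read from the correction lines (a correction line that leaves an entry blank keeps the value on the line above). Because the true probabilities are rounded to three decimals, agreement is only expected to within about 0.001.

```python
import math

# (H, true probability, tabulated "chi-square approximation minus true"),
# as read from the corrected lines; the transcription is an assumption.
corrected_rows = [
    (3.6000, 0.200, -0.035),
    (4.4583, 0.100, +0.008),
    (6.4444, 0.008, +0.032),
    (6.3000, 0.011, +0.032),
    (5.4444, 0.046, +0.020),
    (5.4000, 0.051, +0.016),
    (4.4444, 0.102, +0.007),
    (4.7091, 0.092, +0.003),
    (7.0788, 0.009, +0.020),
    (6.9818, 0.011, +0.019),
    (5.6485, 0.049, +0.010),
    (5.5152, 0.051, +0.013),
]

def chi2_minus_true(H, true_p):
    """Upper-tail chi-square probability with 2 d.f. minus the true probability."""
    return math.exp(-H / 2) - true_p

for H, true_p, tabulated in corrected_rows:
    diff = chi2_minus_true(H, true_p)
    print(f"H = {H:.4f}  computed {diff:+.4f}  tabulated {tabulated:+.3f}")
```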


Rider, Paul R., THE DISTRIBUTION OF THE PRODUCT OF RANGES IN 
SAMPLES FROM A RECTANGULAR POPULATION, Vol. 48, No. 263 (September 
1953), 546-9. 

On page 549, in formula (13) the factor 2 should be removed from the 
denominator. 


Robson, D. S., and King, A. J., MULTIPLE SAMPLING OF ATTRIBUTES, 
Vol. 47, No. 258 (June 1952), 203-15. 

The estimate of variance, equation (6), pages 205 and 215, should 
read 


= 1ГМ – № m..om.. N-n MiMi. 
ТЕ ->| йы ЛА ТЫ уо 
MN т-і Mn | т-і 

п-т "| 

nm n T; — 1 р 


+ 


А proof of the unbiasedness of this estimator may be deduced from the 
appendix by noting, in addition, that 


E mij 57 m 
А | Mj M 


Dwyer, Paul S., and Waugh, Frederick V., ON ERRORS IN MATRIX INVERSION, 
Vol. 48, No. 262 (June 1953), 289-319. 

Dr. W. Duane Evans and Mr. John C. H. Fei have called our attention 
to the need for modifying Section VII of our paper. In that Section 
of the paper we considered the inversion of a given Leontief matrix, 
L = I − A, where each element of A is non-negative. We assumed 
that any element of L might be in error by 100 k per cent, and we proposed 
a very simple upper bound to the discrepancies between elements 
of the given matrix and the true matrix. Unfortunately equation (7.5) 
does not provide an upper bound to such discrepancies. 

As we stated in Section VI, maximum discrepancies in all elements 
of the inverse of the Leontief matrix would occur if each error in the 
given matrix were negative with the absolute value of its bound. Maximum 
discrepancies do not occur with (1 − k)(I − A), since while the 
diagonal errors, −kI, would be negative, yet the non-diagonal errors, 
kA, would be positive; but with 

$$(I - kI) - (A + kA) = L - k(I + A) = L - kH.$$

The extreme inverse matrix may be obtained by calculating $(L - kH)^{-1}$. 
Alternatively the extreme error of $L^{-1}$ can be computed from equation 
(3.4) of our paper. This becomes 

$$D = k\,(L^{-1} H L^{-1})\left[\, I + (kHL^{-1}) + (kHL^{-1})^{2} + \cdots \right]$$

with $HL^{-1} = 2L^{-1} - I$. 

If the diagonal terms are not subject to error the extreme inverse is 
obtained from $I - (1 + k)A$. The discrepancy formula above holds with 
H replaced by A and $AL^{-1} = L^{-1} - I$. 
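
The two routes to the extreme error, direct inversion of L − kH and the series for D, can be compared on a small example. This is a minimal sketch under stated assumptions: the 2 × 2 coefficient matrix A and the value of k are hypothetical, H is taken to be I + A as in the display above, and the helper functions are written out by hand only to keep the example free of external libraries.

```python
def mat_mul(X, Y):
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def mat_add(X, Y, sign=1.0):
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_scale(c, X):
    return [[c * x for x in row] for row in X]

def inv2(M):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

I2 = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.2, 0.3], [0.4, 0.1]]   # hypothetical non-negative coefficients
k = 0.01                       # hypothetical bound: each element of L within 1 per cent

L = mat_add(I2, A, -1.0)       # L = I - A
H = mat_add(I2, A)             # H = I + A
L_inv = inv2(L)

# Route 1: invert the extreme matrix L - kH directly.
D_direct = mat_add(inv2(mat_add(L, mat_scale(k, H), -1.0)), L_inv, -1.0)

# Route 2: D = k (L^-1 H L^-1) [I + (k H L^-1) + (k H L^-1)^2 + ...].
T = mat_scale(k, mat_mul(H, L_inv))
series, power = I2, I2
for _ in range(40):            # the terms shrink geometrically for small k
    power = mat_mul(power, T)
    series = mat_add(series, power)
D_series = mat_mul(mat_scale(k, mat_mul(mat_mul(L_inv, H), L_inv)), series)

print(D_direct)
print(D_series)
```

For this example every element of D is positive, in line with the statement that the extreme case makes all elements of the inverse too large at once.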


Brown, J. A. C., Houthakker, H. S., and Prais, S. J., ELECTRONIC COMPUTATION 
IN ECONOMIC STATISTICS, Vol. 48, No. 263 (September 1953), 

We are indebted to George W. Thomson of the Ethyl Corporation, 
Detroit, for drawing our attention to some errors in the illustrative ex- 
ample quoted in our article. The errors are associated with the con- 
vergence of the iterative process given on p. 410. The value of #8—\5) 
there given is the value for which convergence is effected after two 
iterations, but this does not mark the boundary for one-sided convergence. 
The latter value is 1. A similar correction should be made on p. 
[page number illegible]. There is a further mistake on p. 416 at step (4) where the words 
“positive” and “negative” have to be interchanged, with a corresponding 
change in the interpretation of the function letter En. 


We apologize to readers who may have been confused by these er- 
rors. 




BOOK REVIEWS 


Statistics in Psychology and Education. Fourth Edition. H. E. Garrett. New 
York: Longmans, Green and Company, 1953. Pp. xii, 460, $5.00. 


Frederic M. Lord, Educational Testing Service 


The fourth edition of this widely used text represents a considerable 
reordering and revision of the previous one. Several recent references are 
listed in footnotes to the text. Material on analysis of variance has been 
expanded into a separate chapter, which includes an illustrative example of 
analysis of covariance. Other new materials include a fuller treatment of 
Fisher’s z; methods of drawing a random sample; stratified sampling; the 
fourfold point correlation; factors determining selection of tests in a battery; 
and one-tailed tests of significance. 

The first six chapters will serve as a good text for students with minimal 
mathematical aptitude who are to learn to compute such statistics as means, 
standard deviations, percentiles, normal curve areas, and Pearson correlation 
coefficients. For students who wish to go beyond this, a text that is 
more nearly correct and complete in its statements of logical and statistical 
inferences would be preferable, providing it is not beyond the student’s 
intellectual grasp. 

There is little occasion to take serious exception to the material in the 
first six or seven chapters. In the important section on Standards of Accuracy 
in Computation, however, the reader may reach the erroneous conclusion 
(p. 23) that a square root usually has fewer significant figures than (often 
one-half) the number of significant figures in the number whose square root is 
extracted. (The illustration in the text should be corrected to show that 
√159.5600 = 12.631706 (sic) with an error of no more than .0000022.) 
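
The error bound quoted for the square root can be checked by simple error propagation. The sketch below is an illustration under an assumption the review does not spell out: that 159.5600 is correct to within half a unit in its last decimal place. It adds the resulting uncertainty in the square root to the rounding error of the printed value 12.631706.

```python
import math

x = 159.5600
half_unit = 0.00005        # assumed accuracy of x: half a unit in the fourth decimal
printed_root = 12.631706   # value quoted in the review

root = math.sqrt(x)
# Largest change in the square root when x moves by at most half_unit.
propagated = max(abs(math.sqrt(x + half_unit) - root),
                 abs(root - math.sqrt(x - half_unit)))
# Error from quoting the root to six decimals.
rounding = abs(printed_root - root)
total = propagated + rounding

print(f"propagated {propagated:.2e}, rounding {rounding:.2e}, total {total:.2e}")
```

Under this assumption the total stays just below .0000022, so a seven-figure number yields a square root good to roughly seven figures, not half as many.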

A very worthwhile achievement of the fourth revision is the removal 
(primarily from chapters 8-10, dealing with sampling, standard errors, and 
testing experimental hypotheses) of the serious confusion, pervading the 
third edition, between à priori and fiducial probability. Only the last sentence 
of the final chapter escaped revision: “This correction gives the value which 
R would most probably take in the population from which our sample was 
drawn.” 

Some of the more serious of the remaining errors and misstatements, 
mostly relating to the logic of statistical inference, are listed below: As a 
criterion for general use in judging randomness of sampling it is suggested 
that “If samples are fairly consistent, therefore, they are presumably random 
unless subsequent examination reveals a common bias.” (p. 205). Also, if 
we can assume the trait to be normally distributed, then “symmetry of 
distribution becomes an excellent criterion of sample adequacy.” (p. 204). 
After making a certain test of the significance of the difference between 
means, “... we retain the null hypothesis and conclude with confidence 
that, on the evidence, there is no real difference between Norwegians and 
Belgians on the ‘combined scale’...” (p. 216). A two-tailed test “should 
always be used when, in accordance with the null hypothesis, our two groups 
have conceivably been drawn from the same population...” (p. 217). 
“Forty-two salesmen have been classified into three groups—very good, 
satisfactory, and poor—by a concensus of sales managers... how many 
of the 42 salesmen may be expected to fall in each category on the hypothesis 
of a normal distribution [may be determined from a table of normal curve 
areas] by dividing the baseline of a normal curve (taken to extend over 6σ) 
into 3 equal segments of 2σ each.” (p. 257). 

A misstatement about statistical technique is that “χ² is not stable when 
computed from a table in which any experimental frequency is less than 
5” (p. 258). (It is the theoretical frequencies that are pertinent.) 


The chapter on The Reliability and Validity of Test Scores does not provide 
as clear a discussion of the different kinds of reliability as could be 
desired. Exactly two pages in this chapter, incidentally, are devoted to a 
consideration of item analysis. 

Special favorable mention should be made of the material on Type I and 
Type II errors, and of much of the material on the phi coefficient, on one- 
tailed significance tests, on scaling, and on the multiple correlation coeffi- 
cient. 

In the introduction to the first edition in 1926, Woodworth indicates that 
the statistician for whom the book is intended is “he who has selected the 
scientific or practical problem... . He selects the statistical tools to be 
employed . . . [he] must have a discriminating knowledge of the kit of tools 
which the mathematician has handed him.” In the reviewer’s opinion, whatever 
may have been the case at the time the foregoing was written, the book 
is not appropriate for today’s statistician who answers to, or today’s student 
who is to be trained to answer to, the foregoing description. The book makes 
no serious effort to specify the assumptions underlying many of the statements 
made. In the discussion of regression and prediction, for example, it is 
frequently asserted that the predicted value is the “most likely value” without 
any suggestion that normality or some other special property of the 
bivariate distribution is being assumed. Standard error formulas are given 
and their use illustrated often without indicating to the reader that 1) 
normality has been assumed, and 2) the formulas can only be safely used 
with large samples. 

The chapter on Further Methods of Correlation particularly shows the 
tendency to provide a ready recipe for the calculation of any desired statistic, 
without adequately explaining the meaning of the statistic in question. For 
example, it would be useful to point out that the point biserial correlation is 
simply the Pearson product-moment correlation that would be found if 
any two arbitrary numerical values (e.g., 0 and 1, or 7 and 19) were assigned 
to represent the dichotomous variable, and the usual formula for the 
product-moment correlation were then applied. The reader who wishes to apply in 
actual work the techniques of analysis of variance and covariance, the paired 
comparison scaling technique, or the biserial, tetrachoric, or contingency 
correlation methods should refer to some book giving a more thorough treatment 
than is possible in a text designed primarily for other purposes. 
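
The point about the point biserial correlation can be illustrated numerically. The sketch below is not from the book under review: the scores and group labels are hypothetical, and the "recipe" shown is the usual textbook form (difference of group means over the standard deviation of all scores, computed with divisor n, times the square root of pq). It gives the same value as the ordinary product-moment formula whichever pair of codes is used, provided the higher code goes to the same group.

```python
import math

def pearson(x, y):
    """Ordinary product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def point_biserial(scores, group):
    """Textbook recipe: (M1 - M0) / s * sqrt(p q), with s computed with divisor n."""
    n = len(scores)
    ones = [s for s, g in zip(scores, group) if g == 1]
    zeros = [s for s, g in zip(scores, group) if g == 0]
    p = len(ones) / n
    mean_all = sum(scores) / n
    s = math.sqrt(sum((v - mean_all) ** 2 for v in scores) / n)
    return (sum(ones) / len(ones) - sum(zeros) / len(zeros)) / s * math.sqrt(p * (1 - p))

scores = [12, 15, 11, 18, 20, 14, 17, 22]   # hypothetical test scores
group = [0, 0, 0, 1, 1, 0, 1, 1]            # hypothetical dichotomy

r_recipe = point_biserial(scores, group)
r_codes_01 = pearson(scores, group)
r_codes_7_19 = pearson(scores, [19 if g == 1 else 7 for g in group])
print(r_recipe, r_codes_01, r_codes_7_19)
```

Reversing the codes (giving the lower number to group 1) changes only the sign of the coefficient.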


Sources of Wage Information: Employer Associations. N. Arnold Tolles and 
Robert L. Raimon (New York State School of Industrial Relations at Cornell 
University, Ithaca, New York). “Cornell Studies in Industrial and Labor 
Relations,” Volume III, Spring 1952, pp. xvii, 351. Paper. $3.00. 


M. I. GERSHENSON, California Department of Industrial Relations 


Part I of this monograph presents individual digests of most of the wage 
surveys conducted by employer associations in the United States. The 
summary descriptions of each of the 220 wage surveys conducted by 120 
employer associations include such information as starting date of the 
survey, how frequently the survey is made, what industries and areas are 
surveyed, the number of plants or companies participating, the sample 
coverage, types of data collected, and types of statistical measures used to 
summarize the information; also, types of “fringe” items, method used to 
collect the original information, and form of publication or distribution of 
the data. 

Unfortunately any ambitious listing such as this, which by the nature of 
the project requires a great deal of time, becomes out of date even before 
it goes to press. No wage figures are given and the authors stress the fact 
that the listing of an association does not necessarily imply that the reader 
may obtain any wage figures from that association. As a reference of available 
source data on wages, the listings are certainly valuable but the monograph 
may be frustrating to those seeking specific wage rate information 
since a large number of the entries indicate that the survey results are available 
only to members of the association. 

An alphabetical list of employer associations, a regional finding list, an 
industrial finding list, and a finding list of area-oriented surveys are included 
in Part I together with a technical note on definitions, procedures followed, 
and problems encountered. 

The most valuable contribution to the field is contained in Part II which 
presents a detailed analysis and appraisal of wage surveys conducted by 
employer associations. The authors assess very frankly the elements of 
strengths and weaknesses of existing survey methods and state that this may 
be helpful to employer associations contemplating the conduct of wage 
surveys or seeking to improve their present methods and also to employers 
who seek to interpret and evaluate wage information they receive, to labor 
unions seeking to appraise the validity of the wage information obtained 
through employer association surveys, and to government analysts who may 
need standards for assessing wage information presented to them. Both 
producers and users of wage surveys will find a careful reading of Part II 
of this monograph very worth while. It contains by far the best practical 
discussion to date on the methodology of wage surveys. 

Among a number of weaknesses discussed are those of sampling and of 
methods of collection. It is the authors’ conclusion that the employer 
associations appear to give very little attention to the selection of their wage survey 
samples. One may agree that “the objective should be that of securing a 
representative and balanced sample”, but it should be pointed out that many 
associations have no way of obtaining adequate universe data in terms of 
individual establishments for a given area or industry. 

The authors believe the accuracy of many of the surveys and the uniformity 
of their occupational classification are open to question. They point out 
that nine-tenths of the wage surveys of employer associations are based on 
mail questionnaires and that more than half of those which seek occupational 
data solicit the information in terms of mere job titles without any 
standardized job descriptions. Attention is directed to the very wide range in wage 
rates for individual occupations which results from such procedures. 

It is implied that in many cases greater accuracy and narrower ranges are 
obtained in surveys where the data are collected directly by field visits using 
carefully prepared occupational descriptions, the procedure used by the U. S. 
Bureau of Labor Statistics. Unfortunately this in itself does not insure narrow 
ranges, as is demonstrated by the results of the BLS Occupational Wage 
Surveys. 

Wage surveys can certainly be improved by better methods of sampling 
and collection, but it is this reviewer's opinion that a great deal remains to 
be done in developing means of eliciting more accurate replies from 
respondents. Highly developed job descriptions in the hands of trained field 
agents help, but evidence indicates we must still strive to devise more effective 
means of reducing response error even where we have field collection 
and job descriptions. 

The authors touch on this problem and discuss some possible solutions, 
but there is need for much additional work in this aspect of wage surveys. 


Revue de Statistique Appliquée. Volume 1, No. 1, 1953. Paris: Centre de 
Formation des Ingénieurs et Cadres aux Applications Industrielles de la Statistique, 
Institut de Statistique de l'Université de Paris. Pp. 103. Paper. 


This new journal is the organ of the newly formed (1952) statistical center 
whose full title is given above. The objectives of the Centre are stated as 
follows by its general director, M. Georges Darmois: 


We want to make it possible for the leaders of French industry to train 
their personnel in the effective use of the statistical techniques which have 
so completely ... in other countries. 

... on the other hand to pursue research so that new problems ... 

The Centre offers two types of short courses in statistics for industrial 
personnel. The first, lasting from 10 to 15 days, requires no special 
mathematical training. The second lasts for three weeks and is designed for 
engineers. The first course emphasizes the methods of statistical quality control 
while the second provides a wider coverage of statistical topics. Both are 
oriented toward statistical inference. A detailed outline of contents is given 
in this first issue of the Revue, pp. 16-24. 

The Centre will also maintain а consulting service (Bureau d'Etudes du 
Centre) which will serve the firms and individual engineers who belong to 
the Centre and contribute to its financial support. The Revue, under the 
direction of M. E. Morice, has been established primarily as a liaison with 
the membership. It is to carry examples of statistical applications as well as 
news of statistical meetings and the like. 

For the first few years at least, the Revue will concentrate on discussions 
of the usefulness of statistics in different areas of business, backed up by 
numerous concrete examples. The first issue fits closely in this mold with a 
series of articles describing both general and specific applications in many 
parts of French industry. While there are articles on statistical quality con- 
trol, the applications cover a much wider field of business applications. For 
example there is a discussion of the organization of statistics with the firm, 
the application of statistics in the planning of capacity of equipment needed 
in power plants, description of an industrial experiment, and two articles on 
market research, one of which gives some very interesting data on taste- 
testing. There is an annotated bibliography describing four different books 
on statistics, all in French, which might be of interest to members of the 
Centre (pp. 97-99). 

The Revue hopes gradually to shift its emphasis from concrete applications 
to methodological articles which will be aimed at graduates of the short 
courses described above. It hopes also to publish applications of statistics 
by these graduates and by other readers, and makes an interesting appeal 
for submission of expériences malheureuses whenever a lesson can be learned 
therefrom. The Journal’s address is: 

Monsieur le Rédacteur en Chef de la «Revue de Statistique Appliquée» 
11, rue Pierre-Curie, Paris 5ème, France. H.V.R. 


The Problem of Summation in Economic Science. A Methodological Study 
with Applications to Interest, Money and Cycles. Goran Nyblén. Lund Social 
Science Studies No. 4, Lund, C. W. K. Gleerup, 1951. Pp. xii, 289. 


John S. Chipman, Harvard University 


From the title of this book, one might gain the impression that it is a 
work on index numbers. It is nothing of the sort. Broadly, it is no less 
than a critique of the foundations of modern economic theory, especially the 
theory of distribution; specifically, it is the only attempt this reviewer has 
seen to confront the theory of games with empirical data. 

Nyblén opens in Chapter I with a discussion of “the fundamental idea that 
economic variables are produced by a mechanism, which can be described 
as a system of simultaneous equations” (pp. 5-6) and quotes as typical of a 
predominating contemporary viewpoint Samuelson's assertion that “any 
sector of economic theory which cannot be cast into the mold of such a system 
must be regarded with suspicion as suffering from haziness” (p. 6). 

In the rest of the book Nyblén subjects this point of view to forceful 
criticism. He turns in Chapter II to a discussion of specific economic models, 
dealing first with the Leontief system and criticizing it for being “very 
‘mechanical’ in the sense that no fundamental economic decisions are explicitly 
tied to it...” (p. 21). It should properly be regarded, he says, as 
embedded in a linear programming system, which allows for choice among 
alternative production processes; but such a system can only be made 
determinate by a statement of social objectives (i.e., by maximization of an 
“objective function”) and this implies the control of the economy by a single 
will, and therefore “can comprise no real treatment of a problem of 
distribution” (p. 31). The conclusion is somewhat weakened by the subsequent 
publication of the “substitution theorem” of linear programming, but not 
invalidated.¹ 

Next, Nyblén discusses the marginal productivity theory of distribution, 
pointing out that marginal productivity does not determine the distributive 
shares, but only determines schedules of demand for factors and supply of 
products; distribution is then determined in Walrasian fashion by the 
interrelation of the demand and supply schedules of firms and households. 
If markets are competitive, the solution is determinate, and “the distribution 
process described is automatic, because no particular agreements of any kind 
between the decision units are needed for it to function, and it is 
ultra-harmonic, because no conflicts of interest are present and no unit has more 
influence on the price-determination than any other” (p. 41). On the other 
hand if monopolistic or imperfect markets are introduced, the system becomes 
overdetermined; for, as Marschak has pointed out, the addition of 
sloping demand functions adds more equations to the system than unknowns, 
and these functions can therefore not be independent of one another if 
markets are to be cleared. Distribution is then left unexplained. 

From this impasse Nyblén is led to a consideration of the theory of games. 
He notes the distinction made by von Neumann and Morgenstern between 
“inessential games,” in which the payoff that goes to a set of players is always 
equal to the sum of the amounts those players would receive when 
acting independently, and “essential games” in which the amount received 
by a coalition always exceeds the amount that its members could obtain 
independently. This is where “summation” comes in: the proposition that 
“in general” a set of players can obtain more by coalescing than by acting 
independently is called the “first summation theorem.” In Nyblén’s words: 
“there is always one part of the national income the distribution of which 
can and must necessarily be settled through agreements between the members 
of society; the distribution of the national income can never be completely 
settled in an automatic and harmonious way.” (p. 77). The “generality” 
here, it should be noted, is purely formal, and Nyblén is to be criticized for 
not making sufficient distinction between the formal and the empirical. The 
fact that firms’ revenue functions are necessarily interdependent was partly 
recognized by Chamberlin, and it is curious that Nyblén includes no discussion 
of the former’s solution to the problem in Chapter V of the Theory of 
Monopolistic Competition. However it must be admitted that there is still 
considerable oligopolistic indeterminacy left in the general equilibrium of 
monopolistic competition, so that there is ample justification for Nyblén’s 
view that the system can be neither “automatic” nor “ultra-harmonic”. 

¹ T. C. Koopmans (ed.), Activity Analysis of Production and Allocation, New York, 1951, chapters 

Next, Nyblén takes issue with the assumption of transferable utility, that 
is, with the postulate of von Neumann and Morgenstern that the utility lost 
by one player or set of players is equal to the utility gained by the remaining 
set. As von Neumann and Morgenstern were forced to admit, this boils down 
to the assumption that payoffs are in monetary terms and that players maxi- 
mize the expected value of monetary returns rather than utility. Nyblén 
takes issue with transferable utility on the basis of Arrow’s proposition that 
“in general” no social welfare function can be constructed from individual 
utility functions, so no common standard of value exists; this is the “second 
summation theorem”. He goes on to state: “If such a common preference 
scale exists there can be essentially no diversity of interests at all, and the 
distribution process can constitute no problem” (p. 95). This statement is 
rather extreme, for even if individual orderings of commodities are identical, 
so that a social welfare function can be established, utility is still not trans- 
ferable, that is, interpersonal comparisons still cannot be made. Furthermore, 
there is still room for struggle over distributive shares. Thus the second sum- 
mation theorem is rather a will-o’-the-wisp. 

Once we settle for monetary payoffs, the transferability assumption still 
leaves a serious problem: the constant-sum character of the game. In order 
to deal with non-zero- (or non-constant-) sum games von Neumann and 
Morgenstern introduced, as is well-known, a fictitious n+1-th player who 
receives (or pays) the difference; however, regarding as a “patent absurdity”² 
the notion that this player can make bribes, they limited the solutions of 
non-zero-sum games to discriminatory ones in which “nature” is allowed by the 
real players to receive only a specified amount (in the extreme case, only 
what it could obtain in isolation). As a result there is little to distinguish 
this from the constant-sum game. This strikes me as a principal weakness 
of the theory, and it is reflected in Nyblén’s treatment (p. 91) in which the 
n real players cooperate in order to maximize their payoff from nature 
(this is the “production problem”) and then fight over the spoils (the 
“distribution problem”). This interpretation neglects what is surely the distinctive 
feature of the economic “game”: the way in which the pie is distributed 
affects the size of the pie itself. Nyblén is conscious of the artificial nature 
of this dichotomy between production and distribution, and finally confesses 
that he is “not able to point out a synthesis between the two extremes” 
(p. 128).³ As a second-best solution, he concludes that the main features 
of the free economy are best analyzed in terms of distribution theory rather 
than by production theory, that is, by the theory of games instead of by models 
which can be expressed in terms of systems of equations. 

² John von Neumann and Oscar Morgenstern, The Theory of Games and Economic Behavior, 
Princeton, 1947, p. 513. 

As a first step in applying the theory of games, Nyblén tackles the aggrega- 
tion problem (pp. 57-64). Formally, there are great difficulties in game theory 
in aggregating players into indissoluble groups. Some readers may be unsatisfied
(though intrigued) by his procedure of invoking “incomplete information”
and “‘irrational’ socio-psychological factors” in order to justify


We come then to the empirical part of the book. In Chapter V the author 
divides the economy into four groups: workers, farmers, entrepreneurs, and 
capitalists or rentiers (he gives them the misleading name “savers”) and at- 
tempts to explain the share of the latter in the national income. He sets 


of economic events, the first prior to 1930 and the second subsequently. 
Taking as a measure of rentiers’ relative income share the ratio of the interest
rate to the general price level (p. 169), Nyblén concludes from the data
that rentiers' share in national income was relatively constant before 1930 
and began to decline thereafter. As measures of “the” interest rate he chooses 


Schemes for Generalized Two-person Games,” Contributions to the Theory of Games, Vol. II, edited by
H. W. Kuhn and A. W. Tucker, Princeton, 1953, pp. 301-87.
485-6. 
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Similarly, a rise in yields is of no temporary use to bondholders during a 
period of inflation (with consols, no use at all) unless their holdings are short- 
term. It is only in the long run that parallel fluctuations of interest rates and 
prices can indicate stability of interest-income. Perceiving, however, that
there was, as we may grant, a marked change after 1930, Nyblén comes forth 
with the hypothesis (p. 166) that before 1930 independent central banks 
carried out policies designed to maintain stability in income shares, whereas 
after that date they lost their independence and came under the political 
control of laborers, farmers, and entrepreneurs. In the language of game
theory, bondholders after 1930 became the “excluded player” in a now dis- 
criminatory four-person game. Thus, concludes Nyblén, “the theory of games 
gives a theoretical structure capable of comprising such sharp changes, which 
is remarkably different from the potentialities of traditional economics” 
(p. 165). 

While Nyblén's emphasis on the political determination of distributive 
shares is interesting, his claims for the theory of games are inadequate.
There is nothing within the theory of games to explain the change from an
objective to a discriminatory solution; this follows necessarily from the
static nature of game theory. The change remains exogenous and unexplained.
The theory of games takes the “accepted standard of behavior”
as given, while it is this that is mostly in need of explanation. Like a wardrobe
which provides suits for all occasions, the theory of games can no doubt
provide categories of solutions to fit all the possible facts; but no amount
of study of that wardrobe will predict or even explain what its user will
wear tomorrow. And even then the clothes are completely out of character
with the wearer, and one cannot help feeling that they fit very uncomfortably.

In Chapter VI Nyblén turns to the problem of international distribution 
of income. In the course of lengthy excursions into the quantity theory of 
money, the Patinkin controversy, and the purchasing power parity theory, 
he makes the following observations: that according to the quantity theory 
inflation leaves relative prices, and consequently the distribution of incomes, 
unchanged (Nyblén fails to stress the fact that this analysis is applicable 
only to a stationary economy) and that, if purchasing power parity is as- 
sumed as well, inflation leaves international distribution of income (measured 
in some currency) unchanged. It is curious that Nyblén does not discuss the 
inadequacy of such a measure of a country’s real income, even if the latter 
can be said to exist. He then asserts that the purchasing power parity and 
quantity theories were valid in some periods but not in others, and seeks a 
“theory of theories”. The latter turns out to be the theory of games with
decomposable characteristic functions. These are constant-sum games in which
the players are divided, say, into two sets (the sets will be countries) with the
property that the amount that any group from one set can obtain together
with any group of the other set is the same as the amount the two groups can
obtain in isolation; in other words, there is no advantage to be gained from 
inter-country coalitions. In spite of this property the solutions of this game 
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are not necessarily decomposable, that is, it is not true “in general” that the
sums going to each country are constant. Thus, if the players in one country 
fail to coalesce, they may fail to get their “due”, and a transfer—called an 
“Excess”—will then take place to the other country—a tribute without 
being a bribe. The case in which the tribute is zero is said to be “exceptional” 
(pp. 214-15), but later we are told (p. 220) that its opposite seems
“exceptional”; again the distinction between the formal and the empirical is blurred,
and no attempt is made to give this tribute any interpretive meaning. As a 
result, a factitious hypothesis emerges: that “the observations consistent 
with the purchasing power parity theory imply the prevalence of a zero 
Excess between the countries studied” (p. 223) and “the observations showing
successive changes of the relations between the national incomes compared
imply the presence of a non-zero Excess” (p. 224). If this were all
there was to the hypothesis, the same objections would hold as were pre- 
viously presented; but in this case there is an additional (but hardly startling) 
observation (pp. 224-7): A non-zero Excess comes about only if some region 
is not sufficiently integrated into a coalition; furthermore, once such an Excess
has developed, one may expect a distribution struggle to ensue, taking the
form of competitive inflationary movements. Nyblén makes special note of
(1) the pre-1914 period, in which the purchasing power parity and quantity theories
are said to be valid (apparently in the trivial sense that both exchange rates
and relative price levels were steady), (2) the period of the early twenties
with violent fluctuations in terms of trade, and (3) the period after the second
World War and before Korea, characterized by a worsening of Europe’s
terms of trade. The latter events he blames on Europe’s lack of integration, 
and the policy recommendations follow naturally. 
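
The decomposability property described in this paragraph can be written as v(S ∪ T) = v(S) + v(T) for every group S drawn from one country and every group T drawn from the other. The following is a minimal sketch of a check for that additivity property alone (it does not test the constant-sum condition). It assumes the characteristic function is stored as a dictionary keyed by coalitions; the player labels and payoffs are hypothetical and are not taken from Nyblén's book.

```python
from itertools import chain, combinations


def subsets(players):
    """Return every subset of `players` (including the empty set) as a frozenset."""
    items = list(players)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]


def is_decomposable(v, country_a, country_b, tol=1e-9):
    """True if v(S | T) == v(S) + v(T) for all S within country_a, T within country_b."""
    for s in subsets(country_a):
        for t in subsets(country_b):
            if abs(v[s | t] - (v[s] + v[t])) > tol:
                return False
    return True


# Hypothetical four-player example: players 1 and 2 in one country, 3 and 4 in the other.
A, B = {1, 2}, {3, 4}
v_a = {frozenset(): 0, frozenset({1}): 1, frozenset({2}): 1, frozenset({1, 2}): 4}
v_b = {frozenset(): 0, frozenset({3}): 2, frozenset({4}): 0, frozenset({3, 4}): 5}

# Build v on all 16 coalitions as the sum of the two within-country games,
# so that inter-country coalitions gain nothing by construction.
v = {s | t: v_a[s] + v_b[t] for s in v_a for t in v_b}

print(is_decomposable(v, A, B))
```

Giving any inter-country coalition a bonus, for example raising the value of the coalition {1, 3} by one unit, makes the check fail, which is the case in which cross-country coalitions would pay.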

Nyblén finally turns, in Chapter VII, to an analysis of business cycles. 
He criticizes econometric models depicting cycles as oscillations around an 
equilibrium derived from difference or differential equations, and after dis- 
cussing the works of Schumpeter, Wiener, Domar, and Dahmén, comes 
forth with his own novel theory of cycles. To begin with, “economic progress 
and economic crises and depressions are most intimately connected” (p. 
266) for the following reasons: an innovation, taking the form of a specific 
investment, raises capital values and lowers capital values in other spheres 
of the economy, thus changing the “objective possibilities” of the situation 
(specifically, the characteristic function of the game); there ensues “a dis- 
tribution struggle which we believe to be the essence of crises and depres- 
sions” (p. 266) since it results in bankruptcies and a “breakdown of the pric- 
ing system” (p. 262). The depression is intensified by a distribution struggle 
among major social groups (in addition to conflict among businesses) brought 
about by political upheavals (p. 269n) and only brought to an end when the 
distribution struggle has been settled. Then the revival takes place, since the 
previous innovations had opened up profitable opportunities that had been
neglected during the distribution struggle. 

Nyblén’s business cycle theory impresses me as being the least artificial 
of his hypotheses, and it is noteworthy that it is also the most original and 
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least tied down to the specifications of a particular game-theoretical
construction.

An interesting question which emerges from this discussion is that of determinacy.
Nyblén points out that the distribution struggle is settled through
agreements and therefore not automatic and harmonious; but if its outcome 
can be predicted at all, is it not at least automatic? The question is not one 
of prediction with certainty versus prediction with a given probability, for 
Nyblén rejected stochastic systems-of-equations systems along with the 
rest (p. 5). A partial exit to this impasse might be found in the dynamic 
character of the model, for even if a distribution struggle is settled, a lot of 
time elapses during which the struggle goes on. He admits that the outcome 
“could never be uniquely predictable” (p. 265), yet appears to be committed 
in principle to a belief in the ultimate predictability (at least in a stochastic 
sense) of socio-economic events. The answer seems to be that a theory is 
non-automatic and non-harmonic only if the phenomena it describes are not 
determinate within the economic system, but only within a wider universe; 
and in this wider universe, it appears that hypotheses cannot be expressed 
in terms of systems of equations, but must find some other, qualitative, 
expression. It has been suggested to me (by D. Ellsberg) that the kind of 
prediction that Nyblén and other game theorists may have in mind consists 
of a narrowing down of the class of possible solutions; thus one might be 
able to predict the range of a variable without any specification of a proba- 
bility distribution over that interval. 

Nyblén has done economists a service by attempting to apply the theory 
of games to the facts, but the results cannot be considered conclusive, as the 
theory is imperfect and the statistical methods are crude. Moreover the 
analysis is frequently marred by misplaced concreteness. More important, 
however, is his insistence on the role of political and social phenomena in the 
explanation of economic events. His study remains an exploration into an 
as yet little-known world. It is to be hoped that his work will stimulate others 
into seeking answers to some of the fundamental questions he raises. 


Measurement of Productivity. Organisation of European Economic Cooperation.
Paris: 1952. (U.S. Distribution Agent: Columbia University Press). Pp. 104.
$1.25, Paper.

Peter O. Steiner, University of California (Berkeley)

Many Americans have had the opportunity of meeting with members of 
one or another of the groups of foreign visitors who have come to the United 
States under the sponsorship of the technical assistance program of Euro- 
pean Cooperation Agency. This thin monograph is a report on what was 
learned about methods of productivity measurement by the members of 
three such groups who visited the United States in 1950 to study the produc- 
tivity division of the Bureau of Labor Statistics. Each mission spent five or 
six weeks in Washington listening to lectures by BLS department heads, and 
four weeks in the field visiting industrial firms, universities, and regional 
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offices of BLS. The report seems to be largely a condensed summary of the 
notes they took. 

Americans will be interested in the report chiefly for appraising whether
missions of this sort are worthwhile methods of communication. I offer no
opinion on this subject. The groups felt “the visit to the United States
proved of great value both for the discussions and exchanges of views which
it made possible, and for the cordial relations established between the
representatives of the Member countries,” and urged that further missions be
organized in the future. 

The material covered includes: history and organization of the BLS; uses 
of productivity measures; problems in defining and measuring input, output,
and productivity; procedures for collection of data by direct inquiry
and from secondary sources; and appendixes containing sample question- 
naires, lists of data available, and methods for computation of indexes. 
Since the text is very short (about 30 pages), it is evident that treatment of 
each topic is brief; actually brevity approaches superficiality on most points. 
From the point of view of content, more systematic and adequate treat- 
ment of the issues is available in many publications; see, for example, the 
International Labor Office, Methods of Labour Productivity Statistics, (Ge- 
neva, 1951), or for a more technical treatment, Irving Siegel, Concepts and 
Measurement of Production and Productivity, (Washington, 1952). 

One apparent purpose of the report is to provide information to the Euro- 
pean countries from which the missions were drawn that might be useful 
to them in establishing, revising, or expanding their programs of productivity 
measurement. On most technical issues, as previously suggested, alternative 
sources will be more helpful. The report does a service, however, by warn- 
ing that BLS methods cannot be transferred without being carefully investi- 
gated and adapted. One of my colleagues tells of the time he requested his 
students in an examination to visualize themselves as top executives and 
indicate how they would solve a particular problem he then set before them. 
One paper was turned in almost immediately. The student had written “I 
would hire you as a consultant.” In much the same way one feels that the 
report implicitly urges any government considering productivity measure- 
ment to hire a BLS statistician as a consultant. This, of course, is sound ad- 
vice. 


Concepts and Measurement of Production and Productivity. Irving H. Siegel.
Washington: U. S. Bureau of Labor Statistics, 1952. Pp. 108. Paper.

Arthur L. Broida, Board of Governors of the Federal Reserve System


Irving Siegel has been concerned for many years with the statistical meas- 
urement of production and productivity, both as a practitioner at the WPA 
National Research Project and the Bureau of Labor Statistics and as a student.
This study is in good part a synthesis of the views originally set forth
by him in this JOURNAL and elsewhere. It has been reproduced as a working
paper of the National Conference on Productivity, and can be obtained from 
the BLS. 

Between the introduction and summary are four substantive chapters,
one dealing with concepts and three with technical matters. The most valua- 
ble material is in the technical sections. Among the subjects investigated 
are the relationships between alternative indexes of production (e.g., Paasche 
and Laspeyres) and productivity (e.g., indexes derived by relating employ- 
ment measures to value-weighted and labor requirement-weighted quantity 
indexes); the nature of aggregates; directly calculated indexes and those derived
by deflation; the relationships among indexes of gross output, net output,
and materials consumption; alternative formulations of given indexes;
coverage adjustments; and the decomposition, or “partitioning,” of changes 
in aggregates into additive contributions of various elements. 
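
To make the contrast between alternative indexes concrete, here is a minimal sketch of the two production indexes named above, together with the productivity index each yields when related to an employment measure. The textbook formulas are used: base-period price weights for Laspeyres, given-period price weights for Paasche. All prices, quantities, and man-hours are invented for illustration and come from neither Siegel's study nor this review.

```python
def laspeyres_quantity_index(p0, q0, q1):
    """Quantity index with base-period prices as weights: sum(p0*q1) / sum(p0*q0)."""
    return sum(p * q for p, q in zip(p0, q1)) / sum(p * q for p, q in zip(p0, q0))


def paasche_quantity_index(p1, q0, q1):
    """Quantity index with given-period prices as weights: sum(p1*q1) / sum(p1*q0)."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p1, q0))


# Hypothetical two-product industry (all figures invented for illustration).
p0, p1 = [2.0, 5.0], [3.0, 4.0]      # unit prices in base and given periods
q0, q1 = [10.0, 4.0], [12.0, 6.0]    # quantities in base and given periods
hours0, hours1 = 100.0, 108.0        # labor input in man-hours

laspeyres = laspeyres_quantity_index(p0, q0, q1)   # 54 / 40
paasche = paasche_quantity_index(p1, q0, q1)       # 60 / 46
labor_index = hours1 / hours0

# One productivity index per production index: output index over labor input index.
productivity_laspeyres = laspeyres / labor_index
productivity_paasche = paasche / labor_index

print(f"Laspeyres production index: {laspeyres:.4f}")
print(f"Paasche production index:   {paasche:.4f}")
print(f"Productivity (Laspeyres):   {productivity_laspeyres:.4f}")
print(f"Productivity (Paasche):     {productivity_paasche:.4f}")
```

With the same underlying data the two weighting systems give different production indexes, and therefore different productivity indexes, which is the multiplicity of legitimate measures that the study stresses.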

The notion of the multiplicity of legitimate measures of “production” and
“productivity,” presented in the introduction, is properly given strong emphasis.
Also useful is the review of the meanings given these terms in the
literature of economic theory, national income, and index numbers, which
is found in the chapter on concepts. The summary chapter contains some 
interesting proposals for research. 

Questions may be raised concerning certain of the main ideas presented. 
A great deal of space is given to the “multiperiod macrotype,” a notion intended
to rationalize the numerical comparisons given by indexes. The
author observes that value theory permits only ordinal comparisons, and 
these only under highly restrictive assumptions of constancy in tastes, tech- 
nology, etc., so that the usual production and productivity indexes “do not 
have any ‘economic’ import.” The solution is to imagine something called 
the macrotype (also referred to as a “fictional creature,” a “decision maker,” 
a “mythical appraiser,” a “generalized consumer equally at home in all pe- 
riods,” and a “personification of a formula”) whose “relevant behavior is 
not ‘economic’ in the ordinary sense but is described by the specific content 
and structure of the index”—i.e., in whose eyes the index is numerically significant—
and then “to judge its plausibility.”

Apparently there are separate macrotypes for indexes of every possible 
content and structure. The author does not discuss the bases on which their 
relative “plausibilities” are to be determined, but presumably they would 
be the same as are relevant in evaluating the indexes directly. It is not clear, 
therefore, that the interjection of the macrotype greatly facilitates matters. 
The author believes that “The notion of the ‘macrotype’... dramatizes 
the value judgments that underlie numerical comparisons” (page 10) and 
“Without some such conception we should probably have to abandon attempts
to measure changes in the ‘physical volume’ of the physically changing
goods of an advanced industrial society” (page 39).

Under a proposal for a “sub-product” approach to index construction, in- 
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dex makers are urged to classify their data in terms of sub-products of the 
vertical stages through which each end-product passes. The sub-products 
would be measured in characteristic units and assigned incremental weights. 
This procedure is advanced as preferable to making indexes from end-prod- 
uct data for various reasons, most of which can be summarized under the 
headings of greater accuracy and greater flexibility for analysis. 

The sub-product approach, in effect, is already used in the production indexes
of most countries. These indexes follow an “industry” organization,
with industries separated from one another both horizontally and vertically. 
Thus, there usually are separate industries, and separate index components 
with incremental (value added) weights, for iron ore, pig iron, and steel. 
To extend the method further would require that the “industry” categories 
now used be refined vertically into smaller elements. Undoubtedly this could 
be done in some lines, and it would be desirable to carry it as far as possi- 
ble—particularly where inter-stage inventories are customarily accumulated, 
so that operating rates can differ from one stage to the next. However, in 
most industries as presently defined in the United States any further vertical 
refinement would mean separating successive processes within individual 
plants. With a few exceptions, the difficulties of reporting quantities and, to 
a greater extent, values, would multiply very quickly. 

The author says that the necessary “reorientation of Federal and other 
statistical reporting systems on a grand scale seems very unlikely,” but 
seems to think that it is feasible, and may be undertaken “after some dis- 
illusionment” with present measures. The feasibility of such a large-scale 
program in the foreseeable future is extremely dubious, and if this is the 
“key to substantial further progress,” substantial progress is improbable. 
But there are many keys to progress. One, for example, would be to continue
to fill out the list of products for which reliable current output data are possible.
Significant advances have been made, but the gaps still remaining are
important. Even apart from feasibility questions, this project might well be
given precedence over the collection of sub-product data. Progress can also 
be made in other important ways, such as refining industries horizontally— 
that is, estimating the value added weight for a product class from census 
data for plants concentrating on that class of product. While this, too, has 
definite limits (some commodities are almost always produced in associa- 
tion with certain others) it would require only more effective exploitation 
of existing data and not collection of additional data. Also, it would be use- 
ful to supplement the usual indexes, covering successive stages and employ- 
Т“ Fb weights, with value-weighted end-product measures, con- 
коше ы ое goods and producers’ equipment, Such 
jen erve many purposes not adequately served by 

In another recommendation, “free-composition” indexes are advanced as
preferable to the “customary” chain indexes for resolving “the problem of
continuity of product series due to changes in classification, specifica-
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tions, and variety of goods made and reported” (page 70). The discussion
turns on only one of these points—changes in the variety of goods made.
To handle this problem, which is largely one of “new” products, the author
proposes writing zeros for periods when the production of a particular commodity
was zero, and inventing a hypothetical price for weighting purposes
if production was zero in the weight year. This reasonable solution to the new
product problem has in fact already been used explicitly or implicitly in a 
number of instances. 
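
The treatment of new products described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the product names, the years, and every figure (including the invented weight-year price for the new product) are hypothetical, and the fixed-weight aggregate form is a standard one rather than a formula quoted from the study.

```python
def fixed_weight_index(weights, quantities_by_period, base_period):
    """Weighted quantity index with the base period set to 1.0.

    weights: product -> weight-year price (a hypothetical price for any
             product that was not produced in the weight year).
    quantities_by_period: period -> {product: quantity}; a product missing
             from a period is treated as zero output.
    """
    def aggregate(period):
        q = quantities_by_period[period]
        return sum(w * q.get(product, 0.0) for product, w in weights.items())

    base = aggregate(base_period)
    return {period: aggregate(period) / base for period in quantities_by_period}


# Hypothetical data: the "new product" has zero output in the weight year 1947,
# so its weight of 20.0 is an invented price, as the proposal requires.
weights = {"old product": 10.0, "new product": 20.0}
quantities = {
    1947: {"old product": 100.0, "new product": 0.0},
    1952: {"old product": 100.0, "new product": 10.0},
}

index = fixed_weight_index(weights, quantities, base_period=1947)
print(index)
```

The entire rise in the 1952 figure comes from the new product, so the result depends directly on the hypothetical price chosen for it.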

This alone, however, would not seem to justify condemning “the ritual of 
shifting the time base, the weights, and the product classes, and then chain- 
ing the links” as superfluous “acrobatics.” Weights are usually changed pe- 
riodically so that they may continue to be reasonably representative of cur- 
rent relationships. The resulting separate links are chained together to avoid 
breaks in the series. The question of how frequently weights should be 
changed is certainly subject to debate, as it involves a compromise between
the gain in relevance for some comparisons resulting from recent weights 
and the difficulties of interpretation that linking introduces. The author 
does not comment directly on this subject, leaving the implication that in 
his view weights should not be changed, or, perhaps, that differently-weighted 
segments should not be linked together. Apart from weight changes, the 
linking process is often used for individual components of an index to meet 
some of the other problems the author originally lists, including changes in 
classifications used for the reported data, and changes in the variety of goods 
for which figures are reported. It is not made clear how these difficulties can 
be met by “free-composition” measures. 
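
The linking step defended in this paragraph can be illustrated briefly. The sketch below assumes two index segments computed with different weights that overlap in a single link year; the later segment is rescaled so that it agrees with the earlier one in that year. The function name and all index values are hypothetical.

```python
def chain_link(old_segment, new_segment):
    """Splice two index segments (dicts of year -> index value) into one series.

    The link year is taken to be the last year of the old segment, and it must
    also appear in the new segment. Values of the new segment after the link
    year are rescaled to the level of the old segment.
    """
    link_year = max(old_segment)
    if link_year not in new_segment:
        raise ValueError("segments must overlap in the link year")
    factor = old_segment[link_year] / new_segment[link_year]
    chained = dict(old_segment)
    for year, value in new_segment.items():
        if year > link_year:
            chained[year] = value * factor
    return chained


# Hypothetical segments: 1939 weights (1939 = 100) and 1947 weights (1947 = 100).
old_weights_segment = {1939: 100.0, 1943: 120.0, 1947: 150.0}
new_weights_segment = {1947: 100.0, 1950: 110.0, 1952: 120.0}

series = chain_link(old_weights_segment, new_weights_segment)
print(series)
```

Comparisons within a segment keep their own weights, while comparisons across the link year mix two weighting systems; that is the difficulty of interpretation which has to be set against the gain from more recent weights.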

Other questions may be noted. There are many references to the subject 
of “externality,” a condition which exists when an “average” lies outside 
the range of the terms being averaged. But what all the discussion is about 
is not apparent. For some of the cases where this “danger” is pointed out— 
e.g., productivity measures computed from value-weighted production indexes
and labor input indexes—the relationship is not an average at all. In
this instance the author himself demonstrates, on page 54, that the ratio 
is equivalent to the product of an average and another term. In another 
case—that of coverage adjustments—the problem as posed would appear to 
be more accurately described as a possibly incorrect assumption rather than 
possible “externality.” 

The treatment of coverage adjustments is hardly adequate. The discus- 
sion and evaluation are confined to one of the two problems such adjust- 
ments are designed to meet, that of representing quantity changes for prod- 
ucts reported in value terms only. The existence of the other problem—that 
of eliminating from an industry measure the output of industry-type prod- 
ucts actually made elsewhere, and included in the quantity data—is merely 
mentioned in a footnote. The author observes on page 63 that while the 
adjustment rests on a specific assumption (similar average price changes 
for two sets of goods), the use of unadjusted measures also implies an as- 
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sumption (similar average quantity changes for these goods). But three 
pages later he reports with apparent approval that both the WPA and the 
BLS rejected the coverage adjustment because, among other reasons, they 
preferred not to introduce an “additional” assumption. A hypothetical ex- 
ample is used to demonstrate that adjusted indexes may yield poorer results 
than unadjusted measures in the case of new products unreported in quantity
terms (page 67). Calculation indicates, however, that with the prices
and quantities assumed in the example the “new” product accounts for about
43 per cent of the industry's value of output before its growth and 25 per 
cent after. With more realistic figures adjusted indexes would usually be 
found to understate the growth of new products, but by substantially smaller 
amounts than unadjusted measures. The question of new products, inci- 
dentally, while intriguing, seems to be greatly overrated as a practical prob- 
lem, at least for the United States, in this study and elsewhere. 
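
The two assumptions contrasted in this paragraph can be shown with a small calculation. The sketch assumes the usual form of a coverage adjustment: the price change implied by the products reported in quantity terms is used to deflate the value of all products. That form, the function name, and every figure are assumptions made for illustration; they are not taken from Siegel's example on page 67.

```python
def coverage_adjusted_index(covered_quantity_index,
                            covered_value_0, covered_value_1,
                            total_value_0, total_value_1):
    """Quantity index adjusted for products reported in value terms only.

    Assumes the uncovered products had the same average price change as the
    covered ones, so the covered products' implied price index is used to
    deflate the value of all products.
    """
    implied_price_index = (covered_value_1 / covered_value_0) / covered_quantity_index
    return (total_value_1 / total_value_0) / implied_price_index


# Hypothetical industry: covered products are 80 per cent of value in the base
# period but only two-thirds in the given period.
unadjusted = 1.20   # quantity index of covered products; using it for the whole
                    # industry assumes uncovered quantities moved the same way
adjusted = coverage_adjusted_index(unadjusted, 80.0, 120.0, 100.0, 180.0)

print(f"unadjusted index: {unadjusted:.2f}")
print(f"adjusted index:   {adjusted:.2f}")
```

When the covered share of total value is unchanged between the two periods, the adjusted and unadjusted figures coincide, so the choice between the two assumptions matters only when coverage shifts.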

The author catalogues the sins of index users, and asserts that index makers
prefer comfortable tradition to “the search for promising new paths.”
The only basis offered for this assertion is the index makers’ continued neglect
of Mr. Siegel’s proposals. Also, in connection with one recent innovation
touched on in the text, the author’s information is incorrect: “To overcome
objections to publication raised by interested groups aware of the
practical consequences of a few percentage points, the U. S. Bureau of Census
has been obliged to release three differently weighted 1947 indexes, not
merely one, for each manufacturing industry” (pages 6-7). Mr. Siegel and
his readers will be glad to know that the decision in the 1939-1947 bench- 
mark index project to compile and publish six alternative indexes for each 
industry (under the three weighting systems, with and without coverage 
adjustments) was made in the earliest planning stage of the work, and with- 
out reference to the kind of considerations suggested. 

Emphasis has been given here to some of the more controversial aspects 
of the study, but as has been noted it contains much useful and instructive 
material. The volume comes at a time when interest in the field is high, 
with indexes undergoing revision and new ones being developed in many
countries. 
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